ABSTRACT


Block diagram schemata model computation systems in the context of an
external environment. The environment imposes various constraints on the real-time
performance of any implementation of a block diagram schema. The model is used
to provide precise definitions of real-time performance. The portion of the
implementation that affects the real-time performance is called the control
structure.


This research investigates several strategies for synthesizing control structures
to satisfy the external real-time specifications. The simplest strategy is to
execute all the blocks in the diagram in some fixed order. Control structures of
this type have been somewhat ignored for time-critical applications. The synthesis
problem is shown to be solvable in the sense that acyclic control structures do not
need to be considered. A branch-and-bound synthesis algorithm is presented which
requires exponential time in the worst case. Although no efficient synthesis
algorithm was found, the conjecture that the problem is NP-complete is not proved.


The other strategy for implementing control structures makes use of the fact
that in some applications the input values change at discrete times. Under this
assumption, block diagram schemata are similar to traditional models of real-time
computations. An efficient algorithm for assigning fixed priorities to independent
tasks is presented that guarantees the real-time specifications will be met. This
algorithm relaxes previous restrictions of the deadline for a task being coincident
with its next request.


Finally, some of the issues involved with multiple processor control structures are
discussed, although no specific algorithms are investigated.


Key Words and Phrases: real-time scheduling, priority scheduling, deadline-driven
scheduling, control structures

Real-Time Control Structures for Block Diagram Schemata


1: Introduction

There are many applications for computers where the real-time performance of
the program is critical. These applications all involve asynchronous interaction with
the external environment, and it is this environment that imposes the real-time
restrictions. For example, device drivers in operating systems must respond to
interrupts before the information is lost. Another application is in direct digital
control and process monitoring.

However, most high-level languages are not designed for producing time-critical
programs. The languages allow the user to define appropriate functional and data
abstractions for his problem, but have no notion of real time or asynchronous
interaction with the real world. Instead, the user must design a control structure
for his problem suitable for a single sequential process that will satisfy all the
real-time constraints.


1.1: Previous Work 

Many operating systems do have notions of real time and external input and
output, but they are supported at a fairly low level [19, 20]. The application
program typically has to deal with priorities, setting real-time alarms, and responding
to interrupts. These actions may be necessary to satisfy the constraints, but they
do not bear a close relationship to the constraints. For example, it is seldom
obvious what priority must be assigned to a task that must complete in ten
milliseconds and uses one millisecond of CPU time.


Early work on applications-oriented real-time operating systems was done by
Fiala [5]. Fiala proposed a model of real-time processes characterized by three
parameters per process.


(1) P_i, the maximum CPU time used by process i.

(2) D_i, the maximum delay allowed from the time process i requests service
    to the completion of servicing that request.

(3) T_i, the minimum period between requests for process i.


Fiala proposes three scheduling algorithms for this model. The first (and
simplest) executes the process that must complete the soonest, i.e. the process
with the earliest deadline. This algorithm is optimal in the sense that if any
schedule satisfies the deadline requirements for all the processes, so does the
earliest deadline schedule. However, this result is proved in the context of
process switching requiring negligible overhead.
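
As an illustration only, the earliest deadline rule can be sketched in a few
lines of Python. The sketch assumes that every request is already pending at the
same instant and that process switching is free, as in the optimality result
above. The class Request, its fields and the function earliest_deadline_schedule
are hypothetical names chosen for this example; they are not taken from Fiala's
paper.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str        # hypothetical process name
    cpu_time: float  # maximum CPU time the request needs
    deadline: float  # absolute deadline: request time plus maximum allowed delay


def earliest_deadline_schedule(requests, now=0.0):
    """Run pending requests to completion in earliest-deadline order.

    Returns a list of (name, finish_time, met_deadline) tuples.
    Assumes all requests are pending at `now` and switching costs nothing.
    """
    results = []
    clock = now
    for req in sorted(requests, key=lambda r: r.deadline):
        clock += req.cpu_time
        results.append((req.name, clock, clock <= req.deadline))
    return results


pending = [Request("x", 2.0, 10.0), Request("y", 1.0, 3.0), Request("z", 3.0, 6.0)]
schedule = earliest_deadline_schedule(pending)
```

In this made-up request set the processes run in the order y, z, x and every
deadline is met; serving x first would cause y to miss its deadline.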

Fiala’s second algorithm is a modification of the earliest deadline scheduler that
minimizes the number of process switches while retaining the optimality condition of
the earliest deadline algorithm. This is accomplished by having the scheduler check
to see if the current process must be preempted when a process with an earlier
deadline requests service. This is done by simulating the action of the scheduler
on the current requests. Unfortunately, this algorithm would require extensive
computation whenever a process requests service. Accordingly, Fiala’s third
algorithm pre-computes a lower bound on the expression required by the minimum
switching algorithm. With the lower bound, the extra computation required by the
third algorithm reduces to an extra comparison at process request time. The algorithm
is also optimal in the same sense and requires less overhead than the simpler
earliest deadline algorithm.

However, Fiala makes no attempt to integrate his model and scheduler into a
real-time language system. One such approach is control robotics developed by
Dertouzos [3] and Geiger [6]. A control robotics program is organized as a set of
daemons which continuously monitor some condition and execute the body (a
corrective procedure) when the condition is true. The real-time specifications for a
daemon are the delay from when a condition becomes true to when the program
detects the condition (the recognition time) and the delay from detecting a
condition to executing the body (the response time). Geiger's implementation of
control robotics periodically samples the condition with a period slightly less than
the recognition time (the slightly higher rate will allow for preemption by other
daemon conditions). The daemon bodies are scheduled using an earliest deadline
scheduler.

One weakness of control robotics is that no guarantee of satisfying the real-time
constraints is made at compile time. This could be done if the user declared a
minimum period between executions of a daemon body and the compiler determined
the computation time of the daemon bodies. Since it is impossible to determine the
computation time for an arbitrary procedure, the compiler may require declarations
to determine the computation time.

A more substantial weakness of Geiger's implementation is the assumption that the
conditions for daemons are independent of the execution of other daemon bodies.
Therefore, complex structures of daemons whose conditions depend on variables
changed by other daemons could result in much unnecessary computation. All in all,
control robotics does not provide any more of a model for real-time programming
than Fiala’s work beyond suggesting some syntax for identifying tasks and
specifying their deadlines.

Another system that deals with real-time specifications at the user level is
TOMAL (Task Oriented Microprocessor Language) [12]. On the surface TOMAL is a
combination of a modern block structured programming language and a typical
minicomputer ‘real-time’ operating system. However, in addition to assigning static
priorities to tasks, a response time may be specified for a task. This response
time is similar to the recognition time for control robotics and specifies the maximum
delay between a request for a task activation and the initiation of that task.
Another feature of TOMAL is that interrupt routines only request task activation and
do not respond to the interrupt in any substantive way. This reduces the amount
of object code that does not run under the task scheduler and allows the TOMAL
system to check the consistency of the real-time constraints for the entire system.
However, TOMAL makes no attempt to verify real-time specifications on service
times for tasks.

Data flow schemata deserve mention as a real-time system since one proposed
application is digital signal processing [2, 22]. Data flow is designed to facilitate
highly parallel computation, and statements may be executed as soon as all their
input variables have been computed. If several statements are executable, an
arbitrary statement is chosen. However, with the addition of real-time constraints to
mediate this decision, data flow would be a powerful real-time system. The other
major drawback of data flow is that it is not suited for implementation on
conventional computer architectures.


1.2: Statement of the Problem 

The goal of this research is to develop theory that is applicable to the
implementation of a programming system designed for the restricted domain of
time-critical applications. The main criterion of the suitability of the language to this
domain should be that small changes in the real-time specifications should result in
small, obvious changes in the source program. It is conceivable, and indeed
desirable, that these changes could have a dramatic effect on the object program
produced. This reorganization of the object program is precisely the process that
should be automated.

Conventional languages already provide facilities for functional and data
abstraction, and numerous researchers are already working in this area. Therefore,
this research will focus on the global control structure for programs. This includes
issues such as the number of processors to use in an implementation, deciding
what interrupt structure (if any) is necessary, decomposing the program into tasks,
and assigning parameters required by the appropriate task scheduler.

Since normal language semantic issues are being avoided, the description of a
program can be made extremely simple. The intuitive model for a real-time program
is that of continuous time analog block diagrams. The graph defines a precedence
relation among operators identical to the data flow in the diagram. The program will
be specified as a directed graph of actions to be performed and their functional
dependence, with arcs of the graph representing data paths. The graph must be
acyclic since cycles in a block diagram represent feedback systems. Automatically
producing an object program that solves the feedback equation would require more
detailed semantics for the programs as well as other disciplines outside the scope
of this research. However, in some special cases, cycles can be handled by
rearranging the block diagram. A strict upper bound must be placed on the
computation time required for each action. The real-time constraints specify upper
bounds of the propagation delays through the block diagram and of the bandwidths
of the input and output signals.


1.3: Thesis Overview

Chapter 2 develops the block diagram model of computation. The block diagram
model is a program schematic model similar to data flow. However, real time and an
external environment are explicit in the model. In addition, the block diagram model
separates the data flow of the schema from the control flow, which is embodied in
the control structure. The control structure specifies the execution order of the
blocks at object time. The research problem may be formalized as finding control
structures for block diagram schemata which satisfy the given real-time
specifications. The major use of the model is to define the semantics of the
real-time specifications.

Chapter 3 investigates various static control structures (control structures that
are independent of the data values at object time). Although static control
structures may be used widely in specific applications (particularly in small,
dedicated systems such as those implemented on microcomputers), they have been
ignored by designers of real-time programming systems, mainly because their
real-time performance in the general case has not been studied.

Chapter 4 investigates extended semantics where the external inputs do not
change continuously. In this situation, a dynamic control structure may be used. A
dynamic control structure is a control structure that does depend on the data
values at object time. The chapter investigates a subclass of dynamic control
structures, namely static priority interrupt control structures. The prototypical
example is an interrupt system where the system does nothing until an input
changes, although it includes systems without physical interrupts where the inputs
are sampled. The priorities are static as opposed to the earliest deadline
scheduler where the priority of a task is a function of time.

Chapter 5 discusses some of the issues that arise when more than one
processor is available for the implementation. The performance of
multiprocessor systems is analyzed and the performance of a block
diagram schema is bounded. Some techniques for distributing the processing power
among several processors are suggested, although specific algorithms are not studied.


2: Block Diagram Schemata

Most models of computation do not capture the notion of a “real-time” system
which monitors continuously changing inputs from some external environment. Block
diagram schemata model the external environment explicitly and recognize the
existence of real-time specifications placed by the environment on the computing
mechanism. They are based on the intuitive model of the conventional analog block
diagram whose inputs and outputs are changing continuously. An (m,n) block
diagram schema consists of an (m,n) block diagram module, a control structure, a
configuration and an environment which manipulates the configuration
asynchronously with the control structure. Within the model, it is assumed that
values change continuously. Obviously, the computations cannot be performed
continuously on a digital computer. The real-time specifications determine how
often the control structure must compute new values, as well as how fast it must
compute them.

An (m,n) block diagram module is a directed graph whose nodes are either
blocks or links. The terms predecessor and successor will be used with the
conventional definitions. Data is stored in the links while the blocks perform the
actual computation. Accordingly, only one arc may point to each link. The graph
must be proper in the sense that arcs may not point from links to links or from
blocks to blocks. Uppercase letters will be used to denote blocks and lowercase
letters to denote links. The predecessor of a link is called the specifier of that
link and the successors of a link are called the watchers of the link. The
predecessors and successors of a block are called the inputs and outputs of the
block respectively.

An (m,n) module has m links with no input arcs (input links) and n links with no
output arcs (output links). The input links receive their values from an external,
continuous time function called the input signal. The values at the output links
define an external, continuous time function called the output signal.

The model assumes the existence of a global clock which defines the passage
of real time. Hewitt argues against the use of global clocks since they cannot be
implemented in distributed systems [9]. While Hewitt’s objections against global
clocks are valid, assigning times within Hewitt’s framework of local orderings would
be more complicated. This complexity is unnecessary since the events being timed
are always ordered by one of Hewitt’s local orderings.

A configuration is an assignment of tokens to the links of a schema. The token
contains a value and a set of labels of the form (link, birth). These labels indicate
when the token arrived at the input link named link. Each link always contains some
token, since signals are always defined in a continuous time block diagram.

The computation of a block diagram schema is described by a series of
snapshots. A snapshot consists of a block diagram module and an associated
configuration. The initial snapshot assigns null values to all tokens except for
tokens on the input links of the schema, which are assigned the current value of
the input signal. The label set of all links is initialized to {(link, 0)}. The
computation proceeds from one snapshot to the next through the firing of blocks.
The control structure is the strategy for choosing which block to fire next. The
fired block accesses the tokens on its input links and replaces the tokens on its
output links. The label set for the output token becomes the union of the old label
set of the token and the label sets that were assigned to the tokens on all the
input links of the block. The label (l, t) for the link l at each input arc
of the fired block is replaced by the label (l, time), where time is the current
contents of the global clock. This action occurs after any tokens have been
replaced on the output links, but the time for the new label sets is immediately
after the input tokens were accessed. In addition, if l is an input link, its value is
set to the current value of the input signal. The block need not replace any output
tokens. This differs from data flow since tokens are not removed from the input
links after a block is fired. The data flow restriction is not appropriate since the
value of a token is defined at all times.


The amount of computation time used by block A is denoted t_A. If the control
structure fires block A on some processor at time t, that processor will complete
and replace the output tokens on that block by the time t + t_A. The computation
times used will be upper bounds either computed by whatever language processor
is used to create the primitive blocks or declared by the user.


2.1: Real-Time Performance and Specifications 

A block diagram schema is an approximation to a continuous time block diagram.
There are many factors affecting the quality of the approximation. However, the
factors influenced by the control structure are how long the schema takes to
compute the values of output tokens from the input tokens, and how often it
performs these computations. The real-time specifications will place bounds on
these quantities. A control structure that satisfies all the real-time specifications is
called a feasible control structure.


The age of a token with respect to a link l at time t is defined as t - t_0 if (l, t_0)
is in the label set of the token, and undefined otherwise. The latency between
links a and b is denoted l_ab and is the upper bound of the age at any time of
tokens at b with respect to link a. The user can specify an upper bound on the
latency between two links. The first link will be an input link of the schema and
the second link will be an output link.

Latency specifications can also be expressed in terms of continuous-time
functions:

    \hat{b}(t) = F(a(t - \Delta(t)), \ldots)                              (2-1)

Here \hat{b}(t) is the function whose value is the value of the token at link b at
time t; a(t) is the function whose value is the signal at link a at time t; \Delta(t)
corresponds to the age of the tokens at link b. Notice that \Delta(t) is generally not
constant, but is bounded. The user knows how close \hat{b}(t) must be to
b(t) = F(a(t), \ldots). Using information about the magnitude of F and a and their
derivatives, the user can use equation (2-1) to calculate the latency specifications
necessary to achieve the desired accuracy of \hat{b}(t).
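
One way such a calculation might go is sketched below. The sketch makes
simplifying assumptions that do not come from the text: F depends on the single
input a, F has a bounded derivative |F'| <= K, the input signal has a bounded rate
of change |da/dt| <= M, and the user tolerates an error of at most \epsilon. The
symbols K, M and \epsilon are introduced only for this illustration; \hat{b}(t)
stands for the computed value at link b, b(t) for the ideal value, and \Delta(t)
for the age of the token at b.

```latex
\begin{align*}
|\hat{b}(t) - b(t)|
  &= \bigl| F\bigl(a(t - \Delta(t))\bigr) - F\bigl(a(t)\bigr) \bigr| \\
  &\le K \,\bigl| a(t - \Delta(t)) - a(t) \bigr|
     && \text{(bounded derivative of } F\text{)} \\
  &\le K M \,\Delta(t)
     && \text{(bounded rate of change of } a\text{)} \\
  &\le K M \, l_{ab}
     && \text{(the age never exceeds the latency)}
\end{align*}
% Hence the accuracy requirement |\hat{b}(t) - b(t)| \le \epsilon is met
% whenever the latency specification satisfies  l_{ab} \le \epsilon / (K M).
```

Under these assumptions, halving the tolerated error or doubling the rate of
change of the input halves the latency that may be specified.
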

The other measure of real-time performance is how often new values are
computed. The bandwidth from link a to link b (notation B_ab) is the maximum rate
at which the control structure must compute new values at b from values at a. The
bandwidth specification is not easily expressible in terms of continuous-time
functions. It may be thought of as a requirement on how often the value of b(t)
must change.

The bandwidth specification may seem superfluous since the latency
specification also implies how often the value of b(t) changes. However, it is
possible for a multiple processor control structure to exhibit bandwidth performance
that exceeds the rate implied by the latency specification. An example is shown in
figure 2-1.


t_A = 10 msec
t_B = 10 msec
B_ac = 75/sec
l_ac = 40 msec

A block diagram schema requiring a multi-processor control structure
Figure 2-1

In this example, both A and B require ten milliseconds of computation time. A
single processor control structure that executes ABABAB ... can guarantee a
latency from a to c of forty milliseconds and a bandwidth from a to c of fifty per
second. However, if processor one executes AAA ... and processor two
executes BBB ..., then the latency from a to c is still only forty milliseconds but
the bandwidth increases to one hundred per second.
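
The figures quoted for this example can be checked with a small simulation.
The sketch below is not part of the model's definition. It assumes a chain
a -> A -> b -> B -> c, stamps each token with the time at which A read link a, and
treats the worst case for two processors as the phasing in which B starts just
before A delivers a new token. The function name chain_performance and its
parameters are hypothetical.

```python
def chain_performance(starts_a, starts_b, t_a, t_b, ties_visible):
    """Simulate the chain a -> A -> b -> B -> c (all times in msec).

    starts_a, starts_b: times at which A and B begin firing.
    ties_visible: if True, B sees a token written to b at the very instant
    B starts (one processor, A has just finished); if False it does not
    (worst-case phasing of two independent processors).
    Returns (worst age at c with respect to a, new values at c per second).
    """
    # Each firing of A reads a when it starts and writes b when it completes.
    b_writes = [(s + t_a, s) for s in starts_a]        # (write time, birth)
    c_writes = []
    for s in starts_b:
        births = [birth for (w, birth) in b_writes
                  if w < s or (ties_visible and w == s)]
        if births:                                     # skip the start-up transient
            c_writes.append((s + t_b, max(births)))
    pairs = list(zip(c_writes, c_writes[1:]))
    # A token at c is oldest just before the next token replaces it.
    worst_age = max(next_w - birth for (_, birth), (next_w, _) in pairs)
    longest_gap = max(next_w - w for (w, _), (next_w, _) in pairs)
    return worst_age, 1000.0 / longest_gap


# One processor executing ABABAB ...
single = chain_performance(range(0, 200, 20), range(10, 200, 20), 10, 10, True)
# Processor one executing AAA ..., processor two executing BBB ...
dual = chain_performance(range(0, 200, 10), range(0, 200, 10), 10, 10, False)
```

Under these assumptions both arrangements give a worst-case age of forty
milliseconds at c, while the rate of new values at c doubles from fifty to one
hundred per second.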

While the block diagram model is useful for defining performance for real-time
programs, it does not yield many insights into the problem of synthesizing a feasible
control structure. The graph itself resembles a partial order on a set of tasks, but
the semantics of block diagram schemata are not as restrictive as this partial
order. In most schematic models, a task must not be executed until all its
predecessors have been executed since (presumably) it would not have data
available at all its inputs. The block diagram model has no such restriction and as
a result is able to execute some parts of the schema more often than other parts.

On the other hand, there are certain execution orders that can be ruled out
since they are obviously inefficient. For example, once a block has been fired, it
need not be fired again until one of its predecessors has been fired again since all
its inputs will be unchanged. Therefore, its outputs will not change. Similarly, if no
successor of a block A is fired between firings of A, the previous execution of A
was unnecessary since no block looked at the previous values of the tokens on
the output links of A.

If these restrictions are combined, each firing of a block must be surrounded (in
time) by at least one predecessor and at least one successor. Equivalently, the
allowable execution sequences may be found by shuffling all the paths from an
input link to an output link. These paths will be referred to as constraint paths or
just constraints.


2.2: Functionality of Blocks 

The semantics of block diagram schemata make some useful block functions
awkward to implement. For example, a block that performs differentiation is
essential for applications in real-time process monitoring and control. In classical
direct digital control, the system is discretized by sampling at some specific period.
Differentiators are replaced by unit delays and the feedback gains are adjusted
appropriately. This is possible only because the inputs are sampled at a known
frequency.


In block diagram schemata there is no guarantee of periodic execution. The
bandwidth specifications set a lower bound on how often a block must be
executed, and a different lower bound may be implied by the latency specifications.
They do not place any upper bound on how often the block is executed.
Therefore, it is impossible to tell a priori when and how often a block will be
executed. This would seem to rule out any blocks that would require state
variables, but this is not true. A white noise generator could be implemented using
a pseudo-random number generator. This would use a state variable, but it would
not run into any problems by not knowing how often it is executed. But most other
functions that need to produce or transform a time-dependent sequence of values
will be impossible to implement.

The only general solution to the problem is to have a real-time clock as part of
the system. Then a differentiation block could remember both its previous input and
the time it was last executed and compute the obvious first order approximation.
The major difficulty is that the real-time clock would have to provide much finer
resolution than the 60 cycle clocks found in typical computer systems.
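
A minimal sketch of such a differentiation block is given below, assuming the
block is handed the current clock reading each time it is fired. The class name
Differentiator and the method fire are hypothetical; the model itself does not
prescribe how a block obtains the time.

```python
class Differentiator:
    """Block that remembers its previous input and the time it last fired."""

    def __init__(self):
        self.prev_value = None
        self.prev_time = None

    def fire(self, value, now):
        """Return the first order derivative estimate, or None if unavailable.

        `now` is the reading of the real-time clock at this firing.
        No estimate is produced on the first firing, or if the clock has
        not advanced since the previous firing (resolution too coarse).
        """
        estimate = None
        if self.prev_time is not None and now > self.prev_time:
            estimate = (value - self.prev_value) / (now - self.prev_time)
        self.prev_value, self.prev_time = value, now
        return estimate


d = Differentiator()
first = d.fire(1.0, 0.0)    # nothing to compare against yet
second = d.fire(3.0, 0.5)   # input rose by 2.0 in 0.5 time units
third = d.fire(3.0, 2.0)    # input unchanged over 1.5 time units
```

Because the elapsed time is measured rather than assumed, the estimate remains
meaningful even though the control structure gives no guarantee of periodic
execution.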

The user should be able to define his own time-dependent functions since any
selection of primitive blocks will probably turn out to be too limited for some
application. Therefore, it becomes necessary to provide some primitive blocks
which would probably lead to nonsensical programs if used carelessly. In particular,
if the user had a unit delay block and access to the real-time clock he could define
arbitrary approximations to differentiators, although undisciplined use of the unit
delay block would result in useless programs.

Implementing integration would still be a problem since the block diagram for a
first order integrator would contain a cycle (see figure 2-2). The problem with
cycles is that it is unclear whether the cycle represents use of a state variable,
as in data flow, or implied solution of simultaneous equations, as in continuous time
block diagrams. In the case of integrators it is clear that the cycle represents use
of a state variable, since the cycle contains a unit delay block. In this case, the
cycle can be broken at the input to the delay block. The delay block is treated as
a watcher of link e, even though it gets its input from link f. This transformation
alters the order in which the blocks are executed by changing the constraint
paths. Unit delays were handled by a similar transformation in BLODI [11], a
system for simulating discrete time block diagrams, and would be handled in the
same way by a programmer [21].


A Block Diagram Containing a Cycle
Figure 2-2




2.3: Example 

The interaction between the real-time specifications and the control structure
can be illustrated by a series of examples. In these examples the block diagram
module is left unchanged while the latency and bandwidth specifications are varied.
These variations will necessitate changes in the control structure used to
implement the block diagram schema. The block diagram module itself is shown in
figure 2-3.


Typical block diagram schema 
Figure 2-3 


The simplest control structures to consider are cycles that repeatedly execute
the blocks in some fixed order. There are 3! (= 6) ways of executing four blocks once
per cycle (ignoring starting transients). For a small example like this it is feasible
to enumerate all such cycles and test them to see if they satisfy the latency
constraints¹. All these control structures are independent of when new tokens
actually arrive. The worst-case assumption is that a new token arrives immediately
after the previous token is marked old. This assumption is used in calculating
worst-case latencies, which are shown in figure 2-4. Notice that although ABCD is
better than ACBD and ADBC is better than ADCB, there is no best control structure.
In fact, we can choose latency specifications such that only one of the control
structures will work. The first six control structures in figure 2-4 sample the inputs
once per cycle, i.e. once every 30 time units. However, if any of the bandwidths
B_ae, B_af or B_df is greater than 1/30 then some other control structure must be
used.




Latencies for static control structures 
Figure 2-4 
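
The six single-firing cycles can be listed mechanically. The sketch below
fixes the first block, since rotations of a cycle describe the same control
structure once starting transients are ignored, and permutes the rest. The
function name cyclic_orders is a hypothetical choice for this illustration.

```python
from itertools import permutations


def cyclic_orders(blocks):
    """List every cycle that fires each block exactly once.

    The first block is held fixed because rotating a cycle does not change
    the control structure, leaving (n - 1)! distinct orders for n blocks.
    """
    first, rest = blocks[0], blocks[1:]
    return [first + "".join(p) for p in permutations(rest)]


orders = cyclic_orders("ABCD")
```

For four blocks this yields ABCD, ABDC, ACBD, ACDB, ADBC and ADCB. The count
grows as (n - 1)!, so exhaustive enumeration is practical only for very small
schemata.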


A slightly more complicated class of control structures is cycles where some
blocks may be executed more than once. For example, the control structure
ABCABD has worst-case latencies as shown in figure 2-4. This control structure will
satisfy its bandwidth constraints if B_ae is less than one every twenty time units
and B_af and B_df are less than one every forty-five time units.


1. However, such an algorithm is not practical since the computation time taken by
such an algorithm would grow exponentially with the number of blocks.


The next class of control structures to consider are dynamic control structures
with static priority scheduling. These control structures make use of the current
environment to determine which blocks to fire next. The dynamic control structures
assume that the values of tokens at input links do not change continuously. When
the value of a token at an input link changes, a request is made for a set of tasks.
The request is serviced by firing a fixed sequence of blocks as specified by the
task. Since the processor is generally busy when a request occurs, the requests
are remembered until the processor is idle, when one of the requested tasks is
selected to be executed. Each task is assigned an integer priority. The task with
the highest priority is serviced next. The scheduler is static since the priority for
a task is always the same relative to other tasks. The earliest deadline scheduler
is an example of a dynamic priority scheduler, since the priority of a task depends
on its current deadline. If the task being serviced can be temporarily suspended,
the control structure is preemptive.
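
The non-preemptive case can be sketched as a short simulation. Everything
named in the code is an assumption made for illustration: the function
run_static_priority, the task names on_a and on_d, the priorities, and the block
times, which were chosen only so that the four blocks total 30 time units and
the two tasks together take 35. A larger integer is taken to mean a higher
priority.

```python
def run_static_priority(tasks, block_time, requests):
    """Non-preemptive static priority scheduling of block-firing tasks.

    tasks: task name -> (priority, sequence of blocks to fire).
    block_time: block name -> computation time.
    requests: list of (arrival time, task name).
    Returns (trace of (start time, block), time the processor goes idle).
    """
    arrivals = sorted(requests)
    waiting = []                    # requests remembered while the processor is busy
    trace = []
    clock = 0
    i = 0
    while i < len(arrivals) or waiting:
        while i < len(arrivals) and arrivals[i][0] <= clock:
            waiting.append(arrivals[i][1])
            i += 1
        if not waiting:
            clock = arrivals[i][0]  # processor idles until the next request
            continue
        name = max(waiting, key=lambda n: tasks[n][0])   # highest priority first
        waiting.remove(name)
        for block in tasks[name][1]:
            trace.append((clock, block))
            clock += block_time[block]
    return trace, clock


# Hypothetical task system: fire ABD on a change at one input, CD at another.
tasks = {"on_a": (2, "ABD"), "on_d": (1, "CD")}
block_time = {"A": 5, "B": 10, "C": 10, "D": 5}         # illustrative values only
trace, finish = run_static_priority(tasks, block_time, [(0, "on_a"), (0, "on_d")])
fired = "".join(block for _, block in trace)
```

With both requests arriving together, the blocks fire in the order ABDCD and
the processor becomes idle after 35 time units.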

A dynamic control structure need not be interrupt driven. For example, the
control structure could sample the inputs between executing blocks. However,
preemptive control structures cannot be implemented without interrupts.

In the example of figure 2-3, there are many ways to construct tasks to be
requested by changing inputs. One such task system is to fire ABD (or ADB) when
the value at a changes, and CD when the value at d changes. The worst case
occurs when the values at a and d change simultaneously. The latencies for this
case are shown in figure 2-5. These latencies can be sustained only if the
bandwidths at a and d are both less than once every 35 time units (otherwise the
control structure would fall behind). In a sustained worst case, new tokens arrive
once every 35 time units. A trace of block firings would seem to indicate that the
static control structure ABDCD is being simulated, which has latencies 15 to 20
units larger than those for the dynamic control structure. However, in the dynamic
case it is known exactly when the input signal changes. In particular, the processor
will be idle if more than 35 time units elapse between changes in input signals, so
the processor will be able to respond to a change immediately. In a static control
structure, the change would not be responded to until the control structure gets
around to it.


Latencies for dynamic control structures with static schedulers 
Figure 2-5 


3: Static Control Structures

The main function of the control structure in a schema is to specify when to fire
each block. If the control structure is independent of the configuration (i.e.
unaffected by changes made by the environment), it is a static control structure.
An example of a static control structure is a loop which fires all of the blocks in
the schema cyclically. Control structures which make use of the configuration (e.g. via
interrupts) are called dynamic control structures.

The latency specification from a to b will be satisfied only if all the blocks along
all paths from a to b are fired at least once during each time interval of duration
$l_{ab}$ time units. Otherwise there would be time intervals longer than $l_{ab}$ when
the label at b will not change and therefore the age with respect to a of the token
at b will be greater than $l_{ab}$. Similarly, the bandwidth specification from a to b
will be satisfied if and only if the interval between firing the blocks along the
constraint paths is less than $1/B_{ab}$.


For single processor control structures it is possible to construct a trace of the
blocks that are fired by the control structure. The trace is a string over an
alphabet $\Sigma$ whose elements correspond to the blocks of the schema. Each element
$A$ of $\Sigma$ is assigned a weight (notation $|A|$) equal to $t_A$. The weight of a
string is defined to be the sum of the weights of its elements. A string $S_1$
contains $S_2$ if all the elements of $S_2$ appear in $S_1$ in the order they appear
in $S_2$. For example, the string ABCDE contains the string BD, even though BD is
not a substring of ABCDE.
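The two definitions above (weight of a string, and one string containing another) can be sketched in a few lines of Python. This is only an illustration: the function names and the block weights in `block_time` are assumptions, not values taken from the text.

```python
def weight(trace, block_time):
    """Weight of a string: the sum of the weights of its elements."""
    return sum(block_time[block] for block in trace)


def contains(s1, s2):
    """True if all elements of s2 appear in s1 in the order they appear in s2.

    This is a subsequence test, not a substring test.
    """
    remaining = iter(s1)
    # "block in remaining" consumes the iterator up to the first match,
    # so later elements of s2 must be found after earlier ones.
    return all(block in remaining for block in s2)


# Hypothetical block weights, chosen only for illustration.
block_time = {"A": 10, "B": 5, "C": 10, "D": 5, "E": 5}

print(weight("ABCDE", block_time))
print(contains("ABCDE", "BD"))
print(contains("ABCDE", "DB"))
```

The iterator-based test runs in a single pass over `s1`, which is convenient when the same trace is checked against many constraint paths.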


Regular expressions will be used to denote sets of strings. In particular, if $S$ is
a string, $S^*$ denotes the set of strings $S$, $SS$, $SSS$, ... as well as the empty
string.

It is necessary to model intervals in continuous time of arbitrary origin and
duration, since the latency specifications require all intervals of specific duration
to contain the corresponding constraint path. Therefore the weight of the initial and
final elements of a string may be counted at less than their nominal weights. For
example, if $|a_1 \cdots a_n| = w$ (weighting $a_1$ and $a_n$ at $|a_1|$ and $|a_n|$),
then $[a_1 \cdots a_n]$ is a string of weight less than $w$ since both $a_1$ and $a_n$
are weighted at less than $|a_1|$ and $|a_n|$. Note that if the initial or final
elements do not have full weights, they may not be included as part of any contained
string. Weighting these elements at less than their full values corresponds to
shrinking an interval of size $w$ in continuous time: if the interval starts after
$a_1$ starts executing, then the interval does not contain $a_1$ reading its inputs.
A string will be preceded by a '[' or followed by a ']' if the first or last element
in the string is weighted at less than its nominal value.

A single processor static control structure is completely specified by its trace,
which is determined at compile time (hence the name static control structure). The
real-time specifications on the control structure can be rephrased as constraints on
its trace. In particular, the latency specification from a to b is satisfied if and
only if all the constraint paths from a to b are contained in every substring in the
trace of weight $l_{ab}$. The bandwidth specification is satisfied if and only if the
weight of all substrings between occurrences of the constraint paths is less than
$1/B_{ab}$.


At this point it is possible to deal exclusively with the trace of the control
structure and the constraint paths. Constraint path $i$ will be denoted $C_i$, with
latency specification $l_i$ and bandwidth specification $B_i$. If $C_i$ is a path from
a to b, $l_i = l_{ab}$ and $B_i = B_{ab}$. It will also be necessary to deal with the
tails of the constraint paths. If $C_i = c_{i,1} c_{i,2} \cdots c_{i,n}$, where
$c_{i,j}$ is a block, the $j$th tail of $C_i$ is
$C_{i,j} = c_{i,j} c_{i,j+1} \cdots c_{i,n}$.

Since the control structure must satisfy the real-time specifications for all time,
the trace corresponding to the control structure will be an infinitely long string.
Since the control structure can be implemented only if the trace can be generated
using a finite program, it would be very awkward if the only feasible control
structures were acyclic. Fortunately, it can be proved that if any feasible control
structure exists, then there exists a feasible control structure that fires the
blocks in some cyclic order.


3.1: Existence of Cyclic Control Structures
The theorem proved in this section can be stated as:
Suppose there exists a string $\omega = a_1 a_2 \cdots$ such that $\omega$ satisfies
the real-time constraints. Then there also exists a finite string $\beta$ such that
the string $\beta^*$ also satisfies the real-time specifications.
This theorem will be proved using several lemmas.


Definition: A critical window of a control structure $\omega$ for the constraint
$C_i$ is a substring $\psi_i = a_k \cdots a_m$ of $\omega$ that contains two
occurrences of $C_i$, but $[\psi_i]$ contains no occurrences of $C_i$.
The most critical window for $C_i$ is the critical window with the greatest weight.

Lemma 3-1: The string $\omega$ satisfies the latency specifications for $C_i$ if and
only if $|\psi_i| \le l_i$ for the most critical window $\psi_i$ in $\omega$.
Proof:
only if: Assume $\omega$ satisfies the real-time specifications. Then any substring
of $\omega$ of weight $l_i$ contains $C_i$. In particular, the substring
$[a_k \cdots a_m a_{m+1}]$ of weight $l_i$ must contain $C_i$. Since $[\psi_i]$ does
not contain $C_i$, the substring $[\psi_i$ of weight $l_i - \epsilon$, where
$\epsilon$ is arbitrarily small, contains one occurrence of $C_i$. Therefore,
$|\psi_i| < l_i + \epsilon$, $\epsilon > 0$.

if: Assume the most critical window $\psi_i$ has weight greater than $l_i$. Let
$\tau$ be any substring of $[\psi_i]$ where $|\tau| = l_i$. $\tau$ exists since:
$$|[\psi_i]| = |\psi_i| - \epsilon > l_i - \epsilon$$
Since $\psi_i$ is a critical window, $[\psi_i]$ contains no occurrences of $C_i$.
But $\tau$ is a substring of $[\psi_i]$ and also does not contain $C_i$. Hence,
$\tau$ is a substring of $\omega$ of weight $l_i$ that does not contain the
constraint path. Therefore, $\omega$ does not satisfy the latency specifications. ∎

Corollary: Since $\psi_i$ contains two occurrences of $C_i$, the period between
successive occurrences of $C_i$ must be less than $l_i - |C_i|$.


This lemma shows there is a time limit between the starts of successive
occurrences of $C_i$. The bandwidth specifications directly limit this interval.
Therefore, it will be assumed that the latency specifications are more severe than
the bandwidth specifications. If not, the latency specifications can be adjusted so
that:
is 3! i 
The time remaining until the start of the next appearance of a constraint path is
called the laxity of that constraint. Given a control structure, we can construct a
table of laxities for each position in the corresponding string $\omega$ with the
property that the table entries are non-negative if and only if $\omega$ satisfies
the latency specifications. The only difficulty is in accurately determining the
start of an occurrence of a constraint string. This will be handled by keeping
laxities for the tails of the constraint strings. The true laxity for a string will
be reflected in the laxities of its tails if the start of the constraint path is
falsely identified.


An element of the table $d[i,j,k]$ is the laxity for the path $C_{i,j}$ just before
$a_k$ is fired. The table should be thought of as rectangular with columns labeled
by elements of $\omega$. The entries in the first column are:

$$d[i,j,1] = l_i - |C_{i,j}| \qquad (3\text{-}1)$$

since the constraint path $C_{i,j}$ must occur by $l_i - |C_{i,j}|$. The remaining
columns can be filled in by simple recursion rules.
If the next element in $\omega$ is not the same as the first element in a constraint
path, the laxity for that path decreases by the weight of that element:

$$a_k \ne c_{i,j} \;\Rightarrow\; d[i,j,k+1] = d[i,j,k] - |a_k| \qquad (3\text{-}2)$$
There are two possibilities if the next element in the solution is the same as the
first element in a constraint path. If this is the start of an occurrence of a
constraint path, the laxity for the tail of that path should be no more than the
current laxity for the constraint path. It is possible that the tail will already
have a more severe laxity since different constraint paths can have identical tails.
In addition, the laxity for the whole constraint path will become the original limit
the instant after the first element appears. Therefore, the laxity becomes the
original laxity minus the weight of the first element.


However, if $a_k$ is not the start of an occurrence of $C_{i,j}$, the laxity should
decrease by $|a_k|$. Fortunately, this problem will be handled automatically by
assuming that an occurrence of $C_{i,j}$ starts whenever $a_k = c_{i,j}$. If it is
not part of an occurrence of $C_{i,j}$, $c_{i,j}$ will appear again before all of
$C_{i,j}$ appears. When this happens, the laxity for $C_{i,j+1}$ will have decreased
by the amount the laxity for $C_{i,j}$ should have decreased if the start of the
path had not been incorrectly identified. When $c_{i,j}$ appears again, the laxity
for $C_{i,j+1}$ will be less than the laxity for $C_{i,j}$. Therefore:

$$a_k = c_{i,j} \;\Rightarrow\; \begin{cases} d[i,j+1,k+1] = \min(d[i,j,k],\; d[i,j+1,k] - |a_k|) \\ d[i,j,k+1] = l_i - |C_{i,j}| - |a_k| \end{cases} \qquad (3\text{-}3)$$


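Equations (3-2) and (3-3) can be transformed to produce rules for computing the
$k+1$st column of the laxity table from the $k$th column:

$$d[i,j,k+1] = \begin{cases} l_i - |C_{i,j}| - |a_k| & \text{if } a_k = c_{i,j} \\ \min(d[i,j-1,k],\; d[i,j,k] - |a_k|) & \text{if } a_k = c_{i,j-1} \\ d[i,j,k] - |a_k| & \text{otherwise} \end{cases} \qquad (3\text{-}4)$$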
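The column-by-column rule (3-4) lends itself to a short program. The sketch below is one reading of the rule, not the author's implementation: tails are indexed from 0 rather than 1, the function name `laxity_table` is hypothetical, and the block weights and latency specifications in the usage lines are assumed values for a schema with constraint paths AB, CD and AD.

```python
def laxity_table(trace, block_time, constraints):
    """Build the laxity table for a trace, one column per position.

    constraints maps a constraint path (a string of block names) to its
    latency specification l_i. Each column is a dict keyed by (path, j),
    where j selects the tail path[j:] (0-based here, 1-based in the text).
    """
    def w(s):
        return sum(block_time[b] for b in s)

    # First column: d[i, j, 1] = l_i - |C_{i,j}|
    col = {(p, j): l - w(p[j:])
           for p, l in constraints.items() for j in range(len(p))}
    table = [col]
    for a in trace:
        nxt = {}
        for (p, j), d in col.items():
            if a == p[j]:
                # An occurrence of this tail is assumed to start: reset.
                nxt[(p, j)] = constraints[p] - w(p[j:]) - block_time[a]
            elif j > 0 and a == p[j - 1]:
                # The preceding tail just started: inherit its laxity if tighter.
                nxt[(p, j)] = min(col[(p, j - 1)], d - block_time[a])
            else:
                # Unrelated block: the laxity simply runs down.
                nxt[(p, j)] = d - block_time[a]
        table.append(nxt)
        col = nxt
    return table


# Assumed weights and latency specifications (illustrative only).
block_time = {"A": 10, "B": 5, "C": 10, "D": 5}
constraints = {"AB": 45, "CD": 45, "AD": 60}

table = laxity_table("ABCDABCD", block_time, constraints)
periodic = table[4] == table[8]          # column after one cycle vs. two cycles
feasible = all(v >= 0 for col in table for v in col.values())
print(periodic, feasible)
```

Comparing whole columns, as in the `periodic` check, is the operation the existence proof relies on: once a column repeats, the segment of the trace between the two positions can be repeated indefinitely.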


As an example, figure 3-1 shows the laxity table for the control structure ABCD
and the block diagram module from figure 2-3.

In this table, the laxities at time 60 are identical to the laxities at time 30. The
next column in the table would be identical to the column at time 40. The rest of
the table becomes periodic, and all the entries are non-negative. The periodicity
allows us to prove that $(ABCD)^*$ will satisfy the latency specifications for all
time. This is formalized in the following lemmas:


Lemma 3-2: If:
$$\forall j \quad d'[i,j,m] \ge d[i,j,k] \quad \text{and} \quad a_k = a'_m$$
then:
$$\forall j \quad d'[i,j,m+1] \ge d[i,j,k+1]$$
Proof: From case analysis of (3-4) and elementary algebra. ∎


$t_A = 10$   $t_B = 5$   $t_C = 10$   $t_D = 5$
$l_{AB} = 45$   $l_{CD} = 45$   $l_{AD} = 60$

Typical Laxity Table
Figure 3-1
Lemma 3-3: Let:
$$\alpha = a_1 \cdots a_{k-1}$$
$$\beta = a_k \cdots a_{m-1}$$
$$\gamma = a_m \cdots$$
If $\omega = \alpha\beta\gamma$ satisfies the latency specifications and:
$$\forall i,j \quad d[i,j,k] = d[i,j,m]$$
then:
$$\omega' = \alpha\beta\beta\gamma = a'_1 a'_2 \cdots$$
also satisfies the latency specifications.

Proof: Construct the laxity table $d'$ for $\omega'$.
Since $a_1 = a'_1$, (3-4) leads to:
$$\forall i,j \quad d'[i,j,1] = d[i,j,1]$$
Similarly:
$$\forall i,j,\; s \le m \quad d'[i,j,s] = d[i,j,s] \ge 0 \qquad (3\text{-}5)$$
Therefore:
$$\forall i,j \quad d'[i,j,m] = d[i,j,k]$$
From lemma 3-2:
$$\forall i,j \quad d'[i,j,m+1] \ge d[i,j,k+1] \ge 0$$
Similar reasoning will show:
$$\forall i,j \quad d'[i,j,2m-k-1] \ge d[i,j,m-1] \ge 0$$
Now $a'_{2m-k} = a_m$, so lemma 3-2 still applies:
$$\forall i,j \quad d'[i,j,2m-k] \ge d[i,j,m] \ge 0$$
Inductively:
$$\forall i,j,\; s \ge m \quad d'[i,j,s+m-k] \ge d[i,j,s] \ge 0 \qquad (3\text{-}6)$$
Combining (3-5) and (3-6):
$$\forall i,j,s \quad d'[i,j,s] \ge 0$$
Therefore, from lemma 3-1, $\omega'$ satisfies the latency specifications. ∎


Corollary: Let $\alpha = a_1 \cdots a_{k-1}$, $\beta = a_k \cdots a_{m-1}$, and
$\gamma = a_m \cdots$. If $\omega = \alpha\beta\gamma$ satisfies all the latency
specifications and $d[i,j,k] = d[i,j,m]$ for some $k < m$, then $\alpha\beta^*$ also
satisfies the latency specifications. The proof is by induction. ∎


The main theorem can now be proved by showing that any laxity table will have
duplicate columns and applying lemma 3-3:

Theorem 3-4: If any string $\omega$ satisfies the latency specifications then there
exists a string of the form $\beta^*$ which also satisfies the latency specifications.
Proof: Construct the laxity table for $\omega$. There are a finite number of
possibilities for each table entry since each entry is $l_i - |C_{i,j}|$ minus a sum
of a finite number of $|a_k|$'s. The number of different $|a_k|$'s is limited by
the number of blocks in the block diagram schema. The number of terms in
the sum must be finite since each $|a_k|$ is greater than zero and the laxity
entry is also greater than or equal to zero. Therefore, the possibilities for
each column are limited and eventually some column in the table will be
repeated, so $k$ and $m$ satisfying the conditions of lemma 3-3 exist.

Applying the corollary to lemma 3-3 says a solution of the form $\alpha\beta^*$
exists. However, $d[i,j,1] = l_i - |C_{i,j}| \ge d[i,j,k]$ for all $k$ (the rules for
filling in the table never increase the laxities except to reset $d[i,j,k]$ to
$l_i - |C_{i,j}| - |a_{k-1}|$).
Applying lemma 3-2 shows that $\beta^*$ is also a solution. ∎

The major implication of this theorem is that only cyclic strings need to be
considered for static control structures. These strings can be enumerated, so the
problem of finding a static control structure is in principle solvable. The proof
also places an upper bound on the length of the cycle (equal to the total number of
possible laxities at any position), so an algorithm that generated all possible
strings would be effective in the sense that it would always halt in a finite amount
of time. However, it would require computation time that grows exponentially with
the complexity of the schema, so the problem would be computationally intractable if
this were the only algorithm.


3.2: Generating Real-Time Control Structures 

The problem of generating a feasible control structure is a scheduling problem.
The problem is deterministic since the parameters of the problem are strictly
bounded as opposed to being unbounded random variables. A wide variety of
special cases of the general scheduling problem have been studied, and some
results are surveyed by Gonzalez [7], though relatively little work has been done
on scheduling in the presence of deadlines.

Gonzalez and Soh developed a simple algorithm that minimizes the number of
processors used to schedule independent tasks. The tasks are statically assigned
to processors and always run to completion. The deadlines for each task
correspond to the period of the requests for that task and must be a power of
two. Their algorithm is not optimal if the periods are not a power of two and no
optimal algorithm is known, although several heuristic algorithms have been
investigated.

Liu and Layland considered the problem of scheduling independent tasks on a
single processor [14]. Each task requests service periodically with a deadline for
service coinciding with the time for the next request. They present a method of
assigning static priorities to the tasks that will meet the deadlines if any static
assignment of priorities will. In addition, they prove the schedule which executes
the task whose deadline is earliest is optimal in the sense that it will meet the
deadlines if any schedule will. They then prove necessary and sufficient conditions
for a set of tasks to be scheduled by the earliest deadline (ED) algorithm to meet
all its deadlines, and conclude that the ED algorithm allows 100% utilization of the
processor as opposed to figures as low as 70% for static priority algorithms.
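The contrast between the two utilization figures can be made concrete. The sketch below uses the published Liu and Layland results for periodic tasks with deadlines equal to periods: the ED test is utilization at most 1, and the static priority guarantee holds whenever utilization is at most n(2^(1/n) - 1), which tends to ln 2 (about 69%) as n grows. The function names and the task set are hypothetical.

```python
def utilization(tasks):
    """Processor utilization of periodic tasks given as (computation, period)."""
    return sum(c / t for c, t in tasks)


def static_priority_bound(n):
    """Least upper bound on utilization guaranteed schedulable with static
    priorities for n tasks: n * (2**(1/n) - 1)."""
    return n * (2 ** (1.0 / n) - 1)


def ed_schedulable(tasks):
    """Earliest-deadline test when deadlines coincide with the next request."""
    return utilization(tasks) <= 1.0


def static_priority_guaranteed(tasks):
    """Sufficient (not necessary) test for static priority scheduling."""
    return utilization(tasks) <= static_priority_bound(len(tasks))


# Hypothetical task set: (computation time, request period).
tasks = [(1, 4), (2, 5), (3, 10)]

print(utilization(tasks))
print(ed_schedulable(tasks))
print(static_priority_guaranteed(tasks))
```

A task set that fails the static priority test is not necessarily unschedulable under static priorities; the bound is only a sufficient condition, which is why it is quoted as a worst-case figure.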

Geiger extended the proof of the optimality of ED scheduling to include the case
where the requests are not periodic [6]. Fiala presented the same basic proof and
also derived necessary and sufficient conditions for the ED scheduler with a mix of
periodic and aperiodic tasks [5].

Mok investigated scheduling independent tasks on multiple identical processors
[16]. Mok shows that no optimal algorithm exists for this problem unless the
deadlines, computation times and at least some future request times are known. An
algorithm related to the ED algorithm is presented which is shown to be optimal if
all requests are simultaneous. This algorithm executes those tasks with the least
laxity, where the laxity of a task is the deadline for the task minus its remaining
computation time. Unfortunately, both the least laxity and ED schedulers are shown
to be non-optimal even for tasks with periodic requests. However, the least laxity
scheduler is optimal for periodic deadlines where tasks may be executed at any
time (i.e. if the deadlines are coincident with the next request, the least laxity
scheduler is optimal if it is allowed to execute tasks before they have been
requested).

The problem of scheduling tasks related by a partial order on multiple identical
processors has been studied by Manacher [15]. Deadlines are specified for any or
all tasks in the system. Manacher's algorithm derives deadlines for all tasks in the
system by using the observation that a task must complete executing in time to
allow its successors to execute before their deadlines. The scheduler then
executes those tasks with the earliest deadlines that have had all their
predecessors executed. This algorithm is not optimal, and does not consider either
periodic requests or multiple start-times. However, it is a reasonable heuristic,
especially as the number of processors increases.

Unfortunately, none of these results generalize to the static control structure
problem, even for a single processor, although control structures could be
constructed which would meet the conditions of the particular special case and
satisfy the real-time constraints. For example, if the block diagram consisted of
unconnected (independent) blocks, the earliest-deadline scheduler could be used
with task $i$ being block $i$ and the request period for each task being the minimum
of $l_i/2$ and $1/B_i$. The period between requests would have to be less than
$l_i/2$ since (in the absence of other information) it is possible for the task to be
executed immediately after one request and immediately before the following
deadline. Lemma 3-1 says this time interval must not be greater than $l_i$.


On the other hand, these heuristics are liable to be overly restrictive, particularly
since they tend to deal with independent tasks. It would be possible to derive
independent tasks from a block diagram schema by treating the constraint paths as
independent, but at the cost of introducing new blocks and much unnecessary
computation. One promising approach for deriving a static control structure is to
simulate some more general control structure until a cycle in the trace of that
control structure is found. An obvious choice of a more general control structure is
a least laxity scheduler (using laxities as defined for block diagram schemata) which
follows the partial order for the tasks (blocks) based on the constraint paths.
More precisely, the scheduler would build a laxity table, with starred entries
indicating constraint strings which cannot be fired because of the partial order.
The scheduler chooses the first block of the non-starred constraint string with the
smallest laxity to head the next column. If two constraints have the same laxity,
either can be fired next. Figure 3-2 shows such a laxity table for the block
diagram schema from figure 2-3 using the same latency specifications as figure 3-1.


At time 40, none of the latency specifications have been violated. However,
since there are now two constraints with laxity 0, at least one entry in the next
column will be negative. By firing C at time 10, an additional request for C is
created with deadline 50. In the control robotics environment, the existence of
this request makes scheduling impossible. However, if B is fired and C is delayed
until time 15, the additional request also gets delayed to a point where it is
possible to schedule all the requests. The least laxity algorithm simply does not
deal with interactions between requests and deadlines.


$l_{AB} = 45$   $l_{CD} = 45$   $l_{AD} = 60$

Counter-Example to Least Laxity Scheduling
Figure 3-2

It is interesting to note that the least laxity scheduler fails for this example even
if the constraint path AD is ignored. The remaining constraint paths AB and CD are
independent, yet they cannot be scheduled using the ED algorithm using the
worst-case period of $l_i/2$. If periods are kept at $l_i - |C_i|$, the tasks still
cannot be scheduled by the ED scheduler if the individual blocks are scheduled
separately. The failure in this case can be viewed as an inability of the ED
scheduler to derive the proper phase relation between the tasks.

The schedule shown in figure 3-2 is not the only least laxity schedule. For
example, at time 25 CD has the same laxity as B and therefore C could be fired
instead of B. However, the reader can verify that all the least laxity schedules for
this example fail to satisfy the latency specifications.


3.3: A Branch-and-Bound Method for Generating Control Structures

Rather than generating general control structures and looking for a cycle, the
algorithm described in this section works by generating a cyclic control structure
that satisfies the real-time specifications for one of the constraint paths. The
solutions for other constraint paths are combined to form a control structure that
satisfies all the real-time specifications. The basic semantics of firing blocks rules
out control structures that are not shuffles of the constraint paths since these
control structures perform redundant computations. Therefore, this algorithm should
not miss any solutions. There are two major problems that the algorithm has to
deal with: (1) How many times must each constraint path appear in one cycle of
the total control structure? (2) How should the constraint paths be combined into
one cycle?


3.3.1: Determining the Relative Frequency of Constraint Paths 

The first step in the algorithm is to determine how many times each constraint
appears in one cycle of the total solution. Upper and lower bounds can be derived
from the length of the cycle and the basic latency specification. Consider the
lower bound on the number of appearances of constraint $i$: let $k_i$ be the number
of appearances of $C_i$ in one cycle of the solution $\beta^*$. Let $w_i = |C_i|$ and
$c = |\beta|$. Since the latency specification for $C_i$ requires $C_i$ to appear at
least once every $l_i - w_i$ time units:

$$k_i \ge \frac{c}{l_i - w_i} \qquad (3\text{-}7)$$

This leaves $c$ (the length of the cycle) to be determined. However, if $C_i$ appears
$k_i$ times:

$$c \ge k_i w_i \qquad (3\text{-}8)$$

More precisely, the algorithm starts with the assumption that each block and
constraint appears once and that $c = \sum_j t_j$. This approximation is used to
derive $k_i$ for all constraints in the schema. If any $k_i$ increases, this is used to
update the minimum number of times each block in the constraint must appear, which in
turn may cause $c$ to increase. This process continues until all $k_i$ are consistent
with $c$. In practice, this only takes a few iterations.
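The iteration just described can be sketched as a fixed-point computation. The code below is an illustration under stated assumptions rather than the author's procedure: it takes the smallest integer satisfying (3-7) as the new value of each count, it assumes a block appears as often as the most frequent constraint containing it, and the function name, the iteration cap and the example numbers are all hypothetical.

```python
def appearance_lower_bounds(block_time, constraints, max_iter=100):
    """Iterate (3-7) until the appearance counts are consistent with c.

    constraints maps a constraint path (string of block names) to its
    latency specification. Returns (k, c): a lower bound on the number of
    appearances of each constraint per cycle, and the implied cycle weight.
    """
    def w(path):
        return sum(block_time[b] for b in path)

    k = {p: 1 for p in constraints}
    for _ in range(max_iter):
        # Assumption: each block appears as often as the most frequent
        # constraint that contains it (at least once).
        count = {b: max([k[p] for p in constraints if b in p] or [1])
                 for b in block_time}
        c = sum(count[b] * block_time[b] for b in block_time)
        new_k = {}
        for p, latency in constraints.items():
            period = latency - w(p)          # l_i - w_i
            if period <= 0:
                raise ValueError("latency too small for path " + p)
            # Smallest integer k_i with k_i >= c / (l_i - w_i).
            new_k[p] = max(k[p], -(-c // period))
        if new_k == k:
            return k, c
        k = new_k
    raise RuntimeError("appearance counts did not converge")


# Hypothetical weights and latency specifications.
block_time = {"A": 10, "B": 5, "C": 10, "D": 5}
constraints = {"AB": 40, "CD": 60, "AD": 60}

k, c = appearance_lower_bounds(block_time, constraints)
print(k, c)
```

The iteration cap matters: when the constraints demand more than the whole processor, each increase in the cycle weight forces a further increase in some count and the loop would otherwise never settle.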

Theorem 3-4 places an upper bound on the number of blocks in a cycle, but this
bound is not directly applicable to the branch-and-bound algorithm since the
branch-and-bound algorithm does not try all cycles of a given length. An upper
bound on the number of appearances of any constraint can be easily derived if the
number of appearances of the other constraints is held constant.

First, an upper bound on the length of a cycle can be derived by applying
equation 3-7 to all constraints except constraint $i$. Then the minimum weight of a
cycle containing $k_j$ appearances of $C_j$ can be computed for all $j \ne i$. Letting
$c_{max}$ be the maximum allowed cycle weight and $c_{min}$ be the minimum cycle
weight (not including constraint $i$), the minimum weight of a cycle containing $k_i$
appearances of $C_i$ is:

$$c_{min} + k_i w_i \qquad (3\text{-}9)$$

Therefore, the upper bound on $k_i$ can be derived by restricting the resultant cycle
weight to be less than $c_{max}$:

$$k_i \le \frac{c_{max} - c_{min}}{w_i} \qquad (3\text{-}10)$$

This ignores the possibility of blocks in $C_i$ already appearing in the cycle as part
of other constraints. However, including more appearances of constraint $i$ will
eventually cause the minimum cycle length to exceed $c_{max}$.

This still does not bound the number of appearances for all constraints, since
constraint $i$ can appear more often if constraint $j$ appears more often, etc. Placing
an arbitrary bound on one constraint will also bound the number of appearances of
all other constraints. For example, requiring at least one constraint to appear only
once places a fairly tight bound on all constraints. However, it is not true that a
solution of this type always exists. An example is shown in figure 3-3.


3.3.2: Strategies for Combining Solutions

Once the number of appearances per cycle of each constraint path is known,
the constraint paths can be permuted to form a control structure which satisfies all
the real-time specifications. Many of the techniques for improving the efficiency of
'branch-and-bound' optimization algorithms can be applied to this problem even
though it is not an optimization problem. An optimization problem seeks a
permutation of n objects that maximizes an evaluation function f of the
permutation.


Control Structure: $(ABFDECBFADEBFCF)^*$

Block Diagram Where All Constraints Appear More Than Once
Figure 3-3


A 'branch-and-bound' algorithm for this problem generates permutations for a
subset of the objects and extends these permutations to larger subsets. The
permutations of the subsets are called partial solutions and are arranged in a tree.
Nodes in the tree correspond to partial solutions and the descendants of a node
are the extensions of that partial solution. Branch-and-bound algorithms are often
more efficient than direct enumeration since it is often unnecessary to examine the
entire search tree. The key to pruning the search tree is the dominance relation
on nodes of the tree. The evaluation function f can be extended to arbitrary
nodes of the search tree by defining the value of a non-terminal node to be the
maximum value of its descendants. Then node A dominates node B if and only if
f(A) > f(B). The branch-and-bound algorithm may prune any subtree whose root
node is dominated by some node of the tree that has already been explored.

In general, the dominance relation for a particular optimization problem cannot be
computed without examining the entire tree. However, it is often easy to compute
some weaker relation. These weaker relations are usually referred to as
dominance relations in the literature, so we will use the term strong dominance
relation to refer to the dominance relation that relates A to B if and only if
f(A) > f(B).

Branch-and-bound algorithms vary in the order the tree is searched and how the
dominance relation is used to prune the search tree. Kohler and Steiglitz classified
branch-and-bound algorithms and initiated the theoretical study of dominance
relations [13]. They demonstrated the surprising result that pruning based on a
stronger dominance relation does not always improve the efficiency of the algorithm.
However, Ibaraki showed that stronger dominance relations do lead to more efficient
algorithms for several common classes of branch-and-bound algorithms [10].

Branch-and-bound algorithms as defined by Kohler and Steiglitz also make use of a
function g that places an upper bound on the value of f at each node. If L is the
maximum f(A) for leaf nodes A encountered, pruning subtrees with g(A) < L can
only improve the efficiency of the algorithm. However, the upper bound function
can also be viewed as a particular dominance relation.
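Pruning with an upper bound function can be illustrated on a toy permutation problem. The sketch below is a generic depth-first branch-and-bound, not the control structure algorithm of this section; the names `branch_and_bound`, `f` and `g`, and the toy evaluation function, are assumptions made for the example.

```python
def branch_and_bound(items, f, g):
    """Search for the permutation of items that maximizes f.

    f(perm) evaluates a complete permutation. g(prefix, remaining) must be
    an upper bound on f over every completion of the partial solution.
    """
    best_value, best_perm = float("-inf"), None

    def extend(prefix, remaining):
        nonlocal best_value, best_perm
        if not remaining:
            value = f(prefix)
            if value > best_value:
                best_value, best_perm = value, prefix
            return
        for i, item in enumerate(remaining):
            child = prefix + (item,)
            rest = remaining[:i] + remaining[i + 1:]
            # Prune: no completion of child can beat the best leaf so far.
            if g(child, rest) < best_value:
                continue
            extend(child, rest)

    extend((), tuple(items))
    return best_perm, best_value


def f(perm):
    """Toy evaluation: position-weighted sum of the values."""
    return sum((pos + 1) * v for pos, v in enumerate(perm))


def g(prefix, remaining):
    """Upper bound: place the remaining values in ascending order, which
    maximizes the position-weighted sum of any completion."""
    tail = sorted(remaining)
    return f(prefix) + sum((len(prefix) + pos + 1) * v
                           for pos, v in enumerate(tail))


best_perm, best_value = branch_and_bound([3, 1, 2], f, g)
print(best_perm, best_value)
```

In this toy problem the bound is exact, so the search prunes aggressively; a looser bound would still give the correct answer but visit more of the tree.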

The control structure problem as stated is not an optimization problem. However,
it is still possible to define a dominance relation between nodes of the search tree:
node A strongly dominates node B unless B leads to a valid control structure and A
does not. Assuming the nodes at each level are generated in a random
(lexicographic) order, the best pruning for the algorithm to use is to retain the
node at each level which dominates the other nodes. If this dominance relation can be
easily computed, the algorithm can generate a valid control structure without
backtracking.

As a first step towards computing a dominance relation, define the slack for each
constraint to be the difference between the latency requirement and the latency
actually achieved by the control structure. The constraint with the least slack is
the most critical constraint (MCC). The slack in the MCC could also be used as a
value function to be maximized. If no control structure satisfies the real-time
constraints, the control structure maximizing the slack in the MCC is probably a
good 'close' solution. Also, the slacks may be used to evaluate any heuristic
algorithms for deriving control structures.


The latency achieved by a static control structure for a constraint $C_i$ is the
weight of the most critical window for $C_i$. Adding a block to the cycle of the
control structure cannot increase any slacks since the weight of some critical
window will be increased. The only exception would be if the new block completes
an additional occurrence of some constraint path, thereby creating new critical
windows. This cannot happen if the blocks being added are elements of some
other constraint path, since no constraint path is contained in another constraint
path. Therefore, the MCC slack can be used as an upper bound function in a
branch-and-bound algorithm to maximize the MCC slack. Upper bound functions are
also often used to guide the search in branch-and-bound algorithms. For example,
the algorithm could always expand the node with the greatest upper bound.

If the slacks in each constraint are reduced by the same amount when a new
block is added to the cycle, then the partial solution with the greatest MCC slack
would be a dominant solution. Unfortunately, this is not generally the case.


Consider dividing a cycle $\beta$ of the control structure into regions $\phi_{i,j}$
and $\xi_{i,j}$ as shown in figure 3-4. The $\phi_{i,j}$ regions contain one occurrence
of $C_i$, but $[\xi_{i,j}]$ contains no occurrences of $C_i$. The critical windows of
$\beta$ are $\psi_{i,j} = \phi_{i,j}\,\xi_{i,j}\,\phi_{i,j+1}$. Therefore, adding blocks
to a $\xi_{i,j}$ region increases the weight of $\psi_{i,j}$, and adding blocks to a
$\phi_{i,j}$ region increases the weight of $\psi_{i,j-1}$ and $\psi_{i,j}$. Even if
$|\psi_{i,j}|$ increases, the slack for $C_i$ will not decrease unless
$|\psi_{i,j}| = \max_k |\psi_{i,k}|$. The slacks cannot be used to compute a dominance
relation since the interdependence of constraint paths may force new blocks to be
added within the most critical window of some constraint, while another solution with
a smaller MCC slack might have a critical window of the right size in the right place.


Regions of a Critical Window
Figure 3-4


Keeping vectors of slacks for each constraint path does not correct the problem.
Consider the example shown in figure 3-3 with the latency specification as shown
in figure 3-6. It can be easily verified that $(ADEFCADBC)^*$ is a feasible control
structure for this schema. It is also the only feasible control structure¹. AD and
CF must appear at least twice in one cycle of the solution. Figure 3-5 shows
slacks for these constraints for two partial control structures. The merging of
$(ADAD)^*$ and $(CFCF)^*$ that leads to the solution is $(ADFCADFC)^*$. However, the
slacks for CF in $(ADCFADCF)^*$ are larger and the slacks for AD are the same, so
$(ADCFADCF)^*$ would dominate $(ADFCADFC)^*$ even though it doesn't lead to a
solution.


3.3.3: Performance of the Algorithm

Assume each constraint path contains an average of $k$ blocks. The slack of a
constraint path in a trial cyclic solution can be determined in at most $k$ scans of
the cycle. If there are $n$ constraint paths there will be $O(nk)$ scans of each trial
solution generated by the algorithm. The trial cycles will be $O(nk)$ blocks long (this
ignores the possibility of a constraint appearing several times in one cycle). The
overall time complexity of the algorithm will be $O(n^2 k^2)$ times the number of trial
cycles generated per problem.

1. This was verified by checking all cyclic control structures that might be
generated by a branch-and-bound algorithm assuming that the least critical
constraint only appears once per cycle.


[Table: slack of each constraint for the control structures ADFCADFC and ADCFADCF]

Counter-Example to Slack as a Dominance Relation
Figure 3-5


Assume the trial cycle contains $m_1$ blocks and the next constraint path contains
$m_2$ blocks. There are $(m_1+m_2-1)!$ cycles containing all the blocks, but we are
only interested in one of the $(m_1-1)!$ cyclic orderings of the blocks in the old
cycle and one of the $(m_2-1)!$ cyclic orderings of the blocks in the new constraint
(i.e. we must consider $m_2$ different phase relations of the two cycles). Therefore,
the number of different trial cycles generated at this step is:

$$\frac{(m_1+m_2-1)!}{(m_1-1)!\,(m_2-1)!} = m_2\binom{m_1+m_2-1}{m_2} \qquad (3\text{-}11)$$
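The count in (3-11) can be sanity-checked by enumeration. The sketch below is only an illustration: it reads (3-11) as $m_2\binom{m_1+m_2-1}{m_2}$, treats both the old cycle and the new constraint as cyclic sequences of distinct blocks, and uses hypothetical block names (`a0`, `b0`, ...) and function names that do not come from the text.

```python
from itertools import permutations
from math import comb


def merge_count(m1: int, m2: int) -> int:
    """Closed form assumed for (3-11): m2 * C(m1 + m2 - 1, m2)."""
    return m2 * comb(m1 + m2 - 1, m2)


def cyclic_order(arrangement, members):
    """Order in which `members` occur around the cycle, rotated to start at members[0]."""
    seq = [b for b in arrangement if b in members]
    k = seq.index(members[0])
    return seq[k:] + seq[:k]


def brute_force_count(m1: int, m2: int) -> int:
    """Count merged cycles that keep the cyclic order of both block sets."""
    old = [f"a{i}" for i in range(m1)]   # blocks of the existing trial cycle
    new = [f"b{i}" for i in range(m2)]   # blocks of the constraint being merged
    blocks = old + new
    count = 0
    # Pin the first block so that each circular arrangement is visited exactly once.
    for rest in permutations(blocks[1:]):
        arrangement = [blocks[0], *rest]
        if cyclic_order(arrangement, old) == old and cyclic_order(arrangement, new) == new:
            count += 1
    return count


for m1, m2 in [(2, 2), (3, 2), (2, 3), (3, 3)]:
    print(m1, m2, merge_count(m1, m2), brute_force_count(m1, m2))
```

The enumeration is factorial in the number of blocks, so it is only usable for very small cycles; its purpose is to confirm the closed form, not to replace it.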


Of course, if some blocks of the new constraint are already contained in the old
cycle, or if the next constraint appears more than once, not all of the generated
cycles will be distinct. However, it is rather difficult to avoid generating these
cycles. There will be relatively little extra cost to the algorithm as long as it does
not investigate cycles that are identical to cycles that have already led to
failures. Therefore, the number of trial cycles generated by the merging algorithm
when it finds a solution without backtracking is approximately:

$$\sum_{i=2}^{n} k\binom{ik-1}{k} \qquad (3\text{-}12)$$

Equation (3-12) is $O(kn^{k+1})$, since the binomial term in the sum is $O(n^k)$ and
there are n terms.
If the merging algorithm fails to find a solution, then it must have backtracked
through each trial solution, and the total number of cycles generated is:

$$1 + k\binom{2k-1}{k}\left(1 + k\binom{3k-1}{k}\left(1 + \cdots + k\binom{nk-1}{k}\right)\cdots\right) \qquad (3\text{-}13)$$

which can be approximated by:

$$\prod_{i=2}^{n} k\binom{ik-1}{k} \qquad (3\text{-}14)$$

Equation (3-14) is $O((kn^k)^n)$, or $O(k^n n^{kn})$, and is exponential in the number of
blocks in the schema. This is a very loose upper bound and would only be
achieved if all generated solutions were plausible except when the last constraint
was being merged in. However, this bound is achievable if the first n-1 constraint
paths had relatively large latency specifications while the last constraint path had
relatively small latency specifications. This situation can be easily avoided by
starting with the path with the smallest latency constraints relative to the weight
of the path.

3.3.4: Speeding up the Algorithm

There are many ways the average performance of the algorithm could be
improved. For example, if we had a tighter lower bound on the slack in the MCC,
we could prune more subtrees. We can get a tighter bound by determining what
new blocks must be added to the control structure. Adding a new block always
increases the size of some critical window for a constraint by at least the weight
of the block. Therefore, if the sum of the slacks for a constraint is less than the
total weight of blocks that must be added to the control structure, at least one of
the critical windows for that path will exceed the latency specification for that
path. This tighter bound has no effect on the performance if no backtracking is
necessary. However, if no solution is found, using the tighter bound is roughly
equivalent to reducing n, since fewer constraints need to be combined before the
control structure is recognized as infeasible.

Notice that the performance of the algorithm would not be of polynomial
complexity even if there were a dominance relation that totally ordered the
possibilities at each level. The problem is that the number of partial solutions that
must be generated by a naive algorithm can grow exponentially with the complexity
of the schema. Therefore, finding a good dominance relation is not as important as
finding a search function that generates nodes that are most likely to lead to a
solution first.

Since the weight of the critical windows increases when new blocks are added,
we might try merging in new constraint paths so that no new blocks are added
before trying more general mergings. This will improve the performance if the
solution is an extension of this type of merging, even if the algorithm must
backtrack, since fewer nodes are generated on that level. If the algorithm must
backtrack through all the control structures of this type, the performance of the
algorithm is somewhat worse. The effect of this heuristic may be approximated by
reducing k, since the length of the strings merged into the current control structure
will be reduced.

The other way of improving the performance of the algorithm is to reduce the 
complexity of the problem. This can be done by replacing sub-graphs of the block 
diagram module with new blocks. Whenever the new block is fired, the blocks
comprising the subgraph replaced by the new block are fired in some fixed order. 
This replacement can dramatically reduce k, and would improve both the best- and 
worst-case performance. However, combining blocks in this way can result in a 
schema which has no feasible control structures even though the original schema 
does. 

Since the process of generating a control structure can be so time consuming, it
would be extremely useful to quickly identify real-time specifications that are
impossible to satisfy. One way of doing this is to compute the percentage of CPU
time required by each block. If the sum of this percentage over all blocks in the
schema is greater than 100%, the latency specifications are obviously unsatisfiable.

The percentage of the CPU required by each block is easily computed: each
constraint $C_i$ must be executed at least once every $l_i - |C_i| + \epsilon$ time units.
Therefore, each block $c_{i,j}$ in $C_i$ must be executed at least once every
$l_i - |C_i| + \epsilon$ time units, and its corresponding CPU percentage is:

$$\frac{|c_{i,j}|}{l_i - |C_i| + \epsilon} \qquad (3\text{-}15)$$

If a block appears in several constraints, its CPU percentage is the maximum of
the percentages implied by each constraint the block appears in. Using the
maximum rather than the sum corresponds to assuming that each time the block is
fired it will help satisfy all the constraints it appears in. Although this is not
necessarily the case, it is a lower bound on the CPU usage.
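The CPU-percentage test can be sketched in a few lines. This is a minimal illustration rather than the program described in this chapter: the data layout (a dictionary of block weights and a list of `(path, latency)` pairs), the function name and the small `eps` standing in for $\epsilon$ are all assumptions.

```python
def cpu_fraction_lower_bound(weights, constraints, eps=1e-9):
    """Lower bound on total CPU usage implied by the latency specifications.

    weights:     {block name: computation time of the block}
    constraints: list of (blocks in the constraint path, latency specification)
    """
    fraction = {}
    for path, latency in constraints:
        path_weight = sum(weights[b] for b in path)
        period = latency - path_weight + eps   # each block must run this often
        if period <= 0:
            return float("inf")                # latency shorter than the path itself
        for b in path:
            # A block shared by several constraints takes the maximum, not the sum.
            fraction[b] = max(fraction.get(b, 0.0), weights[b] / period)
    return sum(fraction.values())


weights = {"A": 1, "B": 1, "C": 2}
tight = [(["A", "B"], 4), (["A", "C"], 5)]
loose = [(["A", "B"], 10), (["A", "C"], 12)]
print(cpu_fraction_lower_bound(weights, tight) > 1)   # obviously unsatisfiable
print(cpu_fraction_lower_bound(weights, loose) > 1)   # passes this quick test
```

A result above 1 proves the specifications unsatisfiable; a result below 1 proves nothing, since the bound is optimistic about shared blocks.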

Another quick test for unsatisfiable latency specifications is that the slack in
each latency specification must be larger than the computation time for all blocks
not contained in that constraint path. Otherwise, the $\xi$ portion of some critical
window for that constraint will be too large (refer to figure 3-4).
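The second quick test is equally short. In this sketch the slack of a latency specification is taken to be the latency minus the weight of the constraint path, each block is assumed to appear at most once in a path, and the function and variable names are hypothetical.

```python
def constraints_failing_slack_test(weights, constraints):
    """Return the constraint paths whose slack cannot absorb all the other blocks.

    weights:     {block name: computation time of the block}
    constraints: list of (blocks in the constraint path, latency specification)
    """
    total_weight = sum(weights.values())
    failing = []
    for path, latency in constraints:
        path_weight = sum(weights[b] for b in path)
        slack = latency - path_weight
        other_blocks = total_weight - path_weight   # blocks outside this path
        if slack <= other_blocks:                   # slack must be strictly larger
            failing.append(tuple(path))
    return failing


weights = {"A": 1, "B": 1, "C": 2}
constraints = [(["A", "B"], 4), (["A", "C"], 5)]
print(constraints_failing_slack_test(weights, constraints))
```

Here the path AB has slack 2 but the remaining block C also weighs 2, so AB is reported; the path AC has slack 2 against a remaining weight of 1 and passes.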


3.3.5: Practical Experience 

A branch-and-bound algorithm similar to the one described above has been
implemented as part of a system for implementing continuous-time block diagrams on
conventional microprocessors. The implementation runs on a PDP-11/70 under the
UNIX timesharing system. The block diagram is described using an interactive
graphics editor developed by John Pershing [18]. The branch-and-bound algorithm
is only responsible for choosing the order in which to execute the blocks. The
object code for the block diagram is produced by a separate program.

The program uses all of the heuristics mentioned above, except that it does not
combine sub-graphs into new blocks. The program is able to find control structures
to satisfy most latency specifications for small block diagrams using less than a
minute of CPU time. So far, only one set of latency constraints has been found
where a valid control structure exists but no control structure was found by the
program (see figure 3-3). Some latency specifications require more time to find a
valid control structure.

In the absence of a fast optimal algorithm, it is preferable to have a fast
algorithm which yields ‘good’ control structures quickly. Heuristic algorithms are
generally evaluated in one of two ways. One approach chooses a fixed algorithm and
derives an upper (or lower) bound on how far the algorithm’s solution is from the
optimal solution. For example, Graham's algorithm for scheduling independent tasks
on multiple processors executes tasks which require more processing time first.
The resulting schedule is no more than 4/3 times as long as an optimal schedule
[8].

The other approach develops a family of algorithms each requiring polynomial
time. As the degree of the polynomial increases, the solutions found by the
programs are closer to optimal. The family of algorithms is monotonic in the sense
that an algorithm taking more time never produces a poorer solution than one
taking less time. If the degree of the polynomial were increased to infinity, the
algorithm would be optimal. However, it would also no longer be polynomially time
bounded. An example is a series of scheduling algorithms employing limited
lookahead [1].

The second approach does not seem applicable to the control structure problem.
Limiting the breadth of backtracking yields a family of exponential time algorithms
with the exponent increasing with the amount of backtracking. A family of
polynomial algorithms would result if at most k blocks were merged at a time with
no backtracking. However, these algorithms are very unsatisfactory if any
constraint must appear more than once. If the number of blocks in the constraint
path is less than k, then all blocks for the second (and subsequent) appearance of
the constraint will be merged coincident with the existing occurrences of those
blocks. If k is increased so this does not happen, the performance of the algorithm
is only slightly better than the complete algorithm with no backtracking.


3.4: Heuristics for Generating Control Structures 
Steve Ward has experimented with some quick, simple heuristics for generating
static control structures. Basically, the heuristic constructs control structures of
the form $(\alpha\beta\alpha\gamma\alpha\delta\cdots)^*$, where $\alpha$ is the most critical
constraint path and $\beta$, $\gamma$, $\delta$, et cetera are taken from the other constraint
paths. More specifically, blocks from the next most critical constraint are added to
$\beta$ with the restriction that $|\alpha\beta\alpha|$ is less than $l_\alpha$. If more blocks
remain in the constraint, they are added to $\gamma$ so that $|\alpha\gamma\alpha|$ is less than
$l_\alpha$. Once all constraints have been merged in this way, the latency
specifications are checked: if they are all satisfied, then the generated string is a
feasible control structure.

The heuristic will also call itself using the current solution as $\alpha$, so the
generated solution may also be of the form:


Cader Katey 

Since these heuristics construct a control structure rather than search for one,
they run very quickly. However, they also fail to find solutions to a fairly large
number of latency specifications, even for simple block diagrams. Still, these
heuristics are more attractive as a basis for an approximate algorithm, not only
because of their speed but also because they could be extended to handle
particular styles of block diagrams as the process of constructing control
structures becomes better understood.




4: Static Priority Interrupt Control Structures 

In some applications, the tokens at the input links do not change continuously. If
the control structure can detect when an input changes, the real-time performance
can be improved. Intuitively, this is possible since if no inputs to a block have
changed, that block does not need to be executed. On the average, this type of
control structure ought to do less computation and therefore ought to have better
real-time performance. On the other hand, better average performance does not
guarantee better worst-case performance, and specific questions of performance
must be answered with respect to a particular model.

Although the prototypical example of a dynamic control structure is interrupt
driven, it is important to realize that hardware interrupts are not necessary. For
example, a control structure could sample the inputs until one or more inputs
change. After all the computation initiated as a result of these changes had
completed, the control structure would continue to sample the inputs. In general,
such a scheme would risk missing changes in the inputs. However, the control
structure can use the real-time specifications to guarantee this will not happen.


4.1: Dynamic Control Structures 

Many of the strategies for scheduling independent tasks to satisfy real-time
constraints mentioned in the previous chapter use dynamic control structures. For
example, Liu and Layland use static priority interrupts and consider the case (in our
terms) where the latency is equal to the period between requests [14]. They
consider the earliest deadline scheduler only in this context, although the earliest
deadline scheduler is optimal for any sequence of requests and deadlines, as
mentioned earlier.


Given an optimal scheduler, is there any reason to consider a suboptimal
scheduler? The answer will be yes if a good suboptimal scheduler exists which
uses fewer resources than the optimal scheduler. The earliest deadline scheduler
needs to find the highest priority task to execute whenever a task completes
(alternately, it needs to insert requests into the proper position in a task queue).
A static priority interrupt control structure also needs to find the highest priority
task to execute. However, this is done in hardware by many existing computers,
including current microcomputers. Also, the earliest deadline scheduler requires a
real-time clock to compute the deadlines for each task from the request time and
the latency specification. Therefore, static interrupt control structures are
sufficiently simpler than an earliest deadline control structure to deserve further
consideration.


4.2: Model for Static Interrupt Control Structures 

A static interrupt control structure associates a task with each block in the
diagram. The tasks are related by a precedence relation consistent with the block
diagram. Each task has a priority and may be idle, active, or requested. The
priority may be thought of as an integer, with numerically greater priorities being
better.

When an input changes, all tasks whose blocks are watchers of that input
become requested. The control structure chooses the task with the highest
priority among the requested tasks. This task is active until the block completes
executing, at which point all its successor tasks become requested and the task
itself becomes idle. If the control structure allows active tasks to be suspended
while another task is executed, the control structure is called preemptive.
Otherwise it is non-preemptive. Unless otherwise noted, control structures are
assumed to be preemptive.

The latency performance of any static interrupt control structure can be
determined for each task by adding the computation time for that task to the
maximum computation time used by higher priority tasks while the task is on the
ready queue. The difficulty in this analysis is in determining how much computation
might be used by other tasks.

The simplest case to consider is when all the tasks are independent (each task
consists of exactly one block). Each task i requires $t_i$ units of computation, and
has priority $p_i$, latency $l_i$, and bandwidth $B_i$. Without loss of generality, the
tasks can be numbered so that:

$$p_1 \ge p_2 \ge \cdots \ge p_n$$

The overhead associated with interrupts, selecting a task for execution, etc.
will be ignored for the time being. We shall also assume that all priorities are
distinct.

The latency for task i when its inputs change discretely is simply the maximum
elapsed time between a change in an input and the termination of the task. This
must be less than $l_i$ if the latency specification for task i is satisfied. The
interpretation of the bandwidth specification is also simplified. Instead of
specifying a minimum rate for sampling inputs, the bandwidth specifies the maximum
rate at which an input changes.

The latency specification for task i will be satisfied if and only if the block for
task i can be completely executed during any time interval of duration $l_i$. During
this interval, tasks with priority better than $p_i$ will also be run, and the amount of
CPU time used by higher priority tasks must be less than $l_i - t_i$.

Notice that this model is equivalent to the model used by Fiala. Fiala’s $P_i$
corresponds to $t_i$, $D_i$ corresponds to $l_i$, and $T_i$ corresponds to $1/B_i$. Therefore,
for a single processor we have the obvious restrictions:

$$t_i \le \frac{1}{B_i} \qquad (4\text{-}1)$$

and:

$$\sum_{i=1}^{n} B_i t_i < 1 \qquad (4\text{-}2)$$

The summands in (4-2) are the fractions of CPU time used by each task i. Obviously
the total fraction of the CPU used by all the tasks must be less than one. Equation
(4-1) can be derived from (4-2).

Lemma 4-1: The amount of CPU time used by n independent tasks using a static
priority scheduler in a window of duration $\Delta t$ does not depend on the
relative priority of the tasks.

Proof: The processor is always busy if some task is requesting service. 
Changing the priorities of the tasks will never cause the processor to 
remain idle when some task requests service, nor will it affect when the 
tasks request service. 


Since the control structure only executes a task if some input to the task
changes, task i cannot be executed more often than once every $1/B_i$ time units.
Clearly, a task uses the maximum CPU time in any interval if it requests service at
this maximum rate.


Assume task i requests service at times 0, $1/B_i$, $2/B_i$, ..., and let $C_i(t)$ be
the maximum amount of CPU time used by task i in the interval (0, t). The highest
priority task (task 1) always starts executing immediately after it requests service
and executes for $t_1$ time units, so it will be executed $\lfloor B_1 t \rfloor$ complete times
in the interval. Let $r = t - \lfloor B_1 t \rfloor / B_1$ be the amount of time at the end of the
window after the last request for task 1. Task 1 will be executing during the
interval (t-r, t) since task 1 has the highest priority. However, if $r \ge t_1$, only
$t_1$ units of computation will be used, so:

$$C_1(t) = \lfloor B_1 t \rfloor\, t_1 + \min\left(t_1,\; t - \frac{\lfloor B_1 t \rfloor}{B_1}\right) \qquad (4\text{-}3)$$

The maximum amount of CPU time used by task 1 in the interval $(\Delta t, t+\Delta t)$ is:

$$C_1(t+\Delta t) - C_1(\Delta t) \qquad (4\text{-}4)$$

We will show that this is maximized when $\Delta t = 0$ by showing:

$$C_1(t+\Delta t) - C_1(\Delta t) \le C_1(t)$$

or

$$C_1(t+\Delta t) - C_1(t) \le C_1(\Delta t) \qquad (4\text{-}5)$$

Since the requests for task 1 occur with a regular period, $C_1(t)$ is also periodic.
In fact:

$$C_1(t + 1/B_1) = C_1(t) + t_1 \qquad (4\text{-}6)$$

Therefore, we need only consider $\Delta t$ between 0 and $1/B_1$, in which case:

$$C_1(\Delta t) = \min(t_1, \Delta t) \qquad (4\text{-}7)$$

This is the maximum amount of CPU time used in any interval of duration $\Delta t$,
since the CPU time used cannot be greater than the duration of the interval, nor
can it be greater than $t_1$ if the interval contains less than one period. Therefore,
the inequality in (4-5) holds, since the left hand side is the amount of CPU time
used in an interval of duration $\Delta t$ starting at t.

The worst case for a set of tasks will occur when all tasks request service at
time 0 and continue requesting service at their respective maximum rates. This is
true since the highest priority task will use its maximum amount of CPU time under
these conditions, and by lemma 4-1, any task can be made the highest priority task
without affecting the amount of CPU time used by the set of tasks.

Define $C_i(t)$ by:

$$C_i(t) = \lfloor B_i t \rfloor\, t_i + \min\left(t_i,\; t - \frac{\lfloor B_i t \rfloor}{B_i}\right)$$

The amount of CPU time used by tasks j and k is not necessarily $C_j(t) + C_k(t)$.
The difficulty is that if requests for tasks j and k occur sufficiently
near the end of the window and of each other, then only the higher priority task will
actually be executed. Therefore, it is necessary to determine a precise schedule
for the interval from 0 to t. However, if we are only interested in how much CPU
time is used in this interval, lemma 4-1 assures us that we may assign arbitrary
priorities to tasks j and k.

However, a sufficient condition for satisfying the latency specification for task i
is:

$$l_i \ge t_i + \sum_{j=1}^{i-1} C_j(l_i) \qquad (4\text{-}8)$$

This equation can be made more intuitive if the time required by task j is
approximated by:

$$C_j(t) \approx B_j t_j t \qquad (4\text{-}9)$$

Then equation (4-8) becomes:

$$l_i \ge t_i + \sum_{j=1}^{i-1} B_j t_j l_i \qquad (4\text{-}10)$$

This can be rewritten as:

$$l_i \ge \frac{t_i}{1 - \sum_{j=1}^{i-1} B_j t_j} \qquad (4\text{-}11)$$

The denominator in equation 4-11 represents the fraction of CPU time available to
task i. The effect of higher priority tasks is equivalent to reducing the CPU speed.
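The quantities in this section are simple to compute. The sketch below is an illustration under stated assumptions: a task is written as a tuple `(c, latency, period)` with `period` standing for $1/B_i$, the function names are hypothetical, and the period of 32 used for the second task is an assumed value chosen only to complete the example.

```python
def demand(c, period, t):
    """C_i(t) as in (4-3): CPU time a task can use in (0, t) when it needs
    c units of computation every `period` (= 1/B_i) time units."""
    n = t // period                          # complete periods, floor(B_i * t)
    return n * c + min(c, t - n * period)


def meets_latency(task, higher_priority_tasks):
    """Sufficient condition (4-8) for one task; tasks are (c, latency, period)."""
    c, latency, _ = task
    interference = sum(demand(hc, hp, latency) for hc, _, hp in higher_priority_tasks)
    return latency >= c + interference


def approximate_min_latency(c, higher_priority_tasks):
    """Smallest latency permitted by the approximation (4-11)."""
    available = 1 - sum(hc / hp for hc, _, hp in higher_priority_tasks)
    return c / available if available > 0 else float("inf")


task1 = (2, 15, 4)      # c = 2, latency 15, one request every 4 time units
task2 = (12, 16, 32)    # the period 32 is an assumption made for this example
print(meets_latency(task2, [task1]))   # task 1 has the better priority
print(meets_latency(task1, [task2]))   # task 2 has the better priority
```

With task 1 at the better priority, task 2 needs 12 + 8 = 20 time units and misses its latency of 16; with the priorities reversed, task 1 needs 2 + 12 = 14 and meets its latency of 15.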


4.3: Assigning Priorities to Independent Tasks 

One of the weaknesses of traditional real-time operating systems based on
static priority scheduling is that the system does not verify that the priorities
assigned by the user are consistent with his real-time specifications. Even if the
system checked these specifications, the user still must assign priorities, which do
not have a simple relation to the real-time specifications. The obvious strategy of
assigning the highest priority to the task that requires the fastest response time
does not work. Consider the example in figure 4-1. Either task 1 or task 2 can
run at the best priority since $l_i \ge t_i$. If $p_i = 1/l_i$ then $p_1 > p_2$ and the latency
for task 2 is:

$$t_2 + \lfloor l_2 B_1 \rfloor\, t_1 + \min\left(t_1,\; l_2 - \frac{\lfloor l_2 B_1 \rfloor}{B_1}\right)$$
$$= 12 + \left\lfloor \tfrac{16}{4} \right\rfloor 2 + \min\left(2,\; 16 - \left\lfloor \tfrac{16}{4} \right\rfloor 4\right)$$
$$= 12 + 8 + \min(2, 0)$$
$$= 20 \not\le l_2 = 16$$

However, the latency for task 1 if $p_2 > p_1$ is:

$$t_1 + \lfloor l_1 B_2 \rfloor\, t_2 + \min\left(t_2,\; l_1 - \frac{\lfloor l_1 B_2 \rfloor}{B_2}\right)$$
$$= 2 + \lfloor 15 B_2 \rfloor\, 12 + \min\left(12,\; 15 - \frac{\lfloor 15 B_2 \rfloor}{B_2}\right)$$
$$= 2 + 0 + \min(12, 15)$$
$$= 14 \le l_1 = 15$$

Figure 4-1: Counter-example to priority = 1 / latency


The algorithm successively finds a task that can satisfy its latency
specifications while assigned the lowest priority. If there are several such tasks,
choose one arbitrarily. This task is assigned the lowest priority and removed from
the set of tasks. The next task selected will be assigned a priority higher than all
previously assigned priorities but lower than all tasks still unassigned. This
continues until no task remains or no task can be found that can execute at a
priority lower than all other tasks. In the latter case, no assignment of static
priorities will satisfy all the latency specifications using a single processor. This
algorithm will never make a bad choice. Consider the situation when one or more
tasks remain yet no task can be assigned the lowest priority. Any task that could
possibly run at a lower priority has already been assigned a lower priority.
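The assignment procedure can be sketched as follows, using condition (4-8) as the test for whether a task can run at the lowest remaining priority. The task representation `(c, latency, period)`, the function names and the example values (in particular the period of 32 for the second task) are assumptions made for illustration.

```python
def demand(c, period, t):
    """C_i(t): CPU time used in (0, t) by a task needing c units every `period`."""
    n = t // period
    return n * c + min(c, t - n * period)


def fits_at_lowest(name, remaining):
    """True if `name` meets (4-8) when every other remaining task outranks it."""
    c, latency, _ = remaining[name]
    interference = sum(demand(oc, op, latency)
                       for other, (oc, _, op) in remaining.items() if other != name)
    return latency >= c + interference


def assign_priorities(tasks):
    """tasks: {name: (c, latency, period)}.

    Returns the task names ordered from lowest to highest priority,
    or None when no static priority assignment passes the test.
    """
    remaining = dict(tasks)
    order = []
    while remaining:
        candidate = next((n for n in remaining if fits_at_lowest(n, remaining)), None)
        if candidate is None:
            return None          # no task can take the lowest remaining priority
        order.append(candidate)
        del remaining[candidate]
    return order


tasks = {"task1": (2, 15, 4), "task2": (12, 16, 32)}   # period 32 is assumed
print(assign_priorities(tasks))
```

Each pass tests at most n tasks against at most n others, so the whole assignment needs on the order of n cubed evaluations of `demand`, with no backtracking.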


4.4: More Complex Models 

The model for static interrupt control structures made several simplifying
assumptions, such as ignoring scheduling overhead, assuming preemptive scheduling
and distinct priorities. The model can be easily changed to account for different
assumptions.


4.4.1: Scheduling Overhead 

When a task requests service, the control structure must compare the priority of
the task with the priority of the currently executing task. If the priority of the
current task is higher, then the new request must be queued in some manner. When
any task completes execution, the control structure must select a new task to
execute. Also, switching the processor between tasks will generally involve
setting up some processor registers. However, all of these actions will occur for
every instance of a task requesting service, so these overhead costs can be
included in the maximum CPU time used by task i, $t_i$. The basic algorithm of
finding a task which can be assigned the worst priority while still satisfying (4-8)
is still correct.

4.4.2: Non-preemptive Control Structures

If the currently executing task always runs to completion before a new task is
run, then the latency specification for a task must be large enough to allow for any
task with worse priority to execute, as well as the CPU time used by tasks with
better priority. Thus, (4-8) becomes:

$$l_i \ge t_i + \sum_{j=1}^{i-1} C_j(l_i) + \max_{i < j \le n}(t_j) \qquad (4\text{-}12)$$

Again, the assignment algorithm does not require any changes. This is obvious if
the algorithm finds a valid assignment of priorities. Increasing the priority of some
task relative to task i moves a task into the summation term in equation (4-12).
Since $C_j(t)$ is greater than or equal to $t_j$, making this change can only increase
the right hand side of (4-12).
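Condition (4-12) differs from the preemptive test only by the blocking term. A minimal sketch, assuming tasks are listed from best to worst priority as `(c, latency, period)` tuples; the names and example values are hypothetical.

```python
def demand(c, period, t):
    """C_i(t): CPU time used in (0, t) by a task needing c units every `period`."""
    n = t // period
    return n * c + min(c, t - n * period)


def meets_latency_nonpreemptive(i, tasks):
    """Condition (4-12) for tasks[i]; `tasks` is ordered from best to worst priority."""
    c, latency, _ = tasks[i]
    # CPU time taken by tasks with better priority during the latency window.
    interference = sum(demand(hc, hp, latency) for hc, _, hp in tasks[:i])
    # Longest worse-priority task that may already be running and cannot be preempted.
    blocking = max((lc for lc, _, _ in tasks[i + 1:]), default=0)
    return latency >= c + interference + blocking


tasks = [(12, 16, 32), (2, 15, 4)]      # best priority first; values are illustrative
print([meets_latency_nonpreemptive(i, tasks) for i in range(len(tasks))])
```

In this example the first task must allow for the 2 units of the worse-priority task (12 + 2 = 14, within 16), and the second must allow for the 12 units of the better-priority task (2 + 12 = 14, within 15).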


4.4.3: Non-Distinct Priorities

For various reasons it may be desirable to assign several tasks identical
priorities. For example, the computer hardware may only support a limited number
of interrupt priorities. Since the control structure is free to execute any of the
requested tasks having the highest priority, all tasks having the same priority as
task i must be treated as if they had higher priorities when checking the latency
specifications. This assumes that the control structure only executes task i when
all other requested tasks have priorities strictly worse than $p_i$.

However, this also makes the often unrealistic assumption that a task can be
preempted by a task with equal priority. If this is not the case, it is necessary to
simulate the control structure on the worst case sequence of requests. It is not
sufficient to treat these tasks as if they had lower priority but are not preemptible,
since a pair of tasks can make a sequence of requests so that one of them
requests service again while the other is being executed. Therefore, the first task
can be executed twice while task i is waiting for service, although task i is never
preempted.


4.5: Applications to the Control Structure Problem
Verifying the real-time performance of a static priority scheduler on more
complex task structures is a straightforward extension of the verification for
independent tasks. A latency specification $l_i$ is satisfied if and only if all blocks in
the constraint path can always be executed during any interval of duration $l_i$. It
becomes slightly more complex to compute the amount of CPU time used by higher
priority tasks, since some tasks (blocks) will not be runnable when other tasks are
requested.


4.5.1: Chains of Independent Tasks

If no block appears in more than one constraint path, the constraint paths can
be treated as independent tasks. A task will never be interrupted by a request of
a predecessor if the real-time specifications are met, since the period between
requests is not less than the deadline for any one request.

The priority assignment problem would be very much more difficult if it were
necessary to consider assigning different priorities to individual blocks in a chain.
However, it does not make sense to assign lower priorities to some blocks in the
constraint path, since it makes no difference where in the chain higher priority
tasks are allowed to interrupt. Therefore, all the tasks in the chain can be
assigned the same priority as the task in the chain with the least priority.

In the presence of overhead it is more efficient to create one ‘super-task’ that
executes all the blocks consecutively rather than incurring the overhead of a
request for each block in the chain. However, if the control structure is
non-preemptive, it may be necessary to create several smaller ‘super-tasks’ to reduce
the amount of time that must be spent waiting for low priority tasks to complete.
Deciding how many tasks to create and how large to make them could be made on
the basis of how much CPU time needs to be freed up in order to find a task to
take the currently worst priority.


4.5.2: More Complex Task Relations

There are fundamentally two ways different constraint paths can have a common
block: the common block can have more than one successor or it can have more
than one predecessor. We will first consider the simplest example of each type of
interdependent constraints.

Consider a block diagram in which block A has successors B and C. The
constraint paths for this diagram are AB and AC. Since a request for A will always
cause requests for both B and C, $B_B = B_C$. Therefore, neither B nor C will be
interrupted by requests for A as long as the real-time specifications are met.

Now, if $p_B > p_C$ then the sequence of blocks executed whenever A is requested
is ABC. Otherwise the sequence ACB will be executed. We can therefore replace
the tasks A, B, and C by a task that executes either ABC or ACB. The latency
specification for the new task should be chosen so that it will be satisfied if and
only if the original latency specifications are satisfied. These latency
specifications are satisfied if and only if:

$$l_{AB} \ge t_A + t_B + (\text{time lost to interrupts}) \qquad (4\text{-}13)$$

and

$$l_{AC} \ge t_A + t_C + (\text{time lost to interrupts}) \qquad (4\text{-}14)$$

The CPU time used by interrupting tasks will be identical for both the ABC and ACB
sequence, except if ABC is executed, then B must be considered an interrupting
task in equation (4-14), and similarly for C and equation (4-13). Therefore:

$$l_{ABC} = \min(l_{AB},\; l_{AC} - t_B) \qquad (4\text{-}15)$$

and

$$l_{ACB} = \min(l_{AC},\; l_{AB} - t_C) \qquad (4\text{-}16)$$

and we should choose the sequence that yields the greater latency.
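The choice between the two sequences is a direct application of (4-15) and (4-16). The helper below is a hypothetical illustration; its name, argument names and example values are not taken from the text.

```python
def choose_fork_sequence(l_ab, l_ac, t_b, t_c):
    """Pick the order for the combined task when A has successors B and C.

    Returns (sequence, latency specification of the combined task),
    following (4-15) and (4-16) and keeping the greater latency.
    """
    l_abc = min(l_ab, l_ac - t_b)   # (4-15): B delays the completion of C
    l_acb = min(l_ac, l_ab - t_c)   # (4-16): C delays the completion of B
    return ("ABC", l_abc) if l_abc >= l_acb else ("ACB", l_acb)


print(choose_fork_sequence(l_ab=10, l_ac=20, t_b=3, t_c=4))
print(choose_fork_sequence(l_ab=20, l_ac=10, t_b=3, t_c=4))
```

In the first call the path AB is the tighter one, so B runs first; in the second the path AC is tighter and C runs first. Ties are broken in favour of ABC, which is an arbitrary choice.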

Now consider a block diagram in which C has two predecessors A and B. The
constraint paths for this block diagram are AC and BC. It is also quite possible to
receive a request for C while C is already requested or suspended. However, if C
was first requested by A, the additional request will always be from B and vice
versa. If this occurs, the logical thing to do is to have C executed only once, but
in general the sequence AC will be executed whenever A is requested and BC will
be executed whenever B is requested.

It is sufficient to replace A, B, and C by two tasks which execute AC and BC
respectively, ignoring the possibility that at times C may not need to be executed
by one of the tasks. However, if no assignment of priorities is found treating these
tasks as independent, it is not necessarily true that no such assignment would
exist if the common block C were handled more carefully. The difficulty is that the
worst case sequence of requests becomes harder to construct.


4.5.3: Combining Static and Dynamic Control Structures

Rather than having the processor idle when no tasks are requested, it may be
possible to have the processor executing a static control structure for some
portion of the block diagram. In this case we would consider the static control
structure to be the lowest priority task. There are no real-time specifications on
this task in the usual sense, although we must still guarantee the latencies in the
static control structure. This can be done by modifying the latency specifications
so that even when the maximum amount of CPU time is used by the dynamic tasks,
the static control structure still runs often enough.


Consider a latency specification l_i for C_i. The blocks in C_i must be executed
once in every interval of duration l_i. The trace of the processor is no longer
completely determined by the static control structure since the dynamically
scheduled tasks will interrupt the static control structure. However, the amount of
CPU time used by these tasks is known. Therefore, we need only choose new
latency specifications for the statically executed constraints according to the
following equation:

    l_i' = l_i - \sum_{j=1}^{k} C_j(l_i)                (4-17)


where constraints 1 through k are executed by the static priority interrupt control
structure.
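
As a rough illustration of this adjustment, the sketch below computes a reduced latency specification for the static control structure. It rests on two assumptions that are not stated in the original text: that equation (4-17) has the form l_i' = l_i - sum over j of C_j(l_i), and that each dynamic task j is requested at most once every p_j time units and needs t_j units of CPU per request, so that C_j(l) can be bounded by ceil(l / p_j) * t_j. All names are hypothetical.

```python
import math


def dynamic_demand(interval, period, exec_time):
    """Assumed bound on the CPU time one dynamic task uses in an interval:
    one request per period, each costing exec_time."""
    return math.ceil(interval / period) * exec_time


def adjusted_latency(latency, dynamic_tasks):
    """Reduced latency specification for a statically executed constraint.

    latency:       original specification l_i.
    dynamic_tasks: list of (period, exec_time) pairs for tasks 1..k.
    A result of zero or less means the static structure cannot be
    guaranteed to run often enough under this bound.
    """
    used = sum(dynamic_demand(latency, p, t) for p, t in dynamic_tasks)
    return latency - used
```

With l_i = 100 and dynamic tasks (period 50, time 5) and (period 30, time 2), the bound on dynamic CPU time is 2*5 + 4*2 = 18, so the static control structure would be synthesized against a latency of 82. A different model of C_j would change the numbers but not the shape of the calculation.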
6: Summary and Conclusions

We have presented a model for real-time computations that provides precise
definitions of real-time performance. The model has the additional advantage of
strongly corresponding to intuition. This makes the model ideal for defining the
semantics of a real-time programming language. The model also avoids close
association with any implementation. Therefore, the model is applicable to a wide
variety of systems. Conversely, a language based on this model should be easily
implementable in a wide variety of ways, without encountering features of the
model too finely tuned to a particular implementation.

Several strategies for implementing control structures for block diagram systems
were investigated. The first strategy was to find a static execution order for the
blocks in the diagram. Control structures of this type have been somewhat ignored
for time critical applications. An important result is that any such control structure
could be represented as a finite cycle, although the bounds on the length of the
cycle are so large that explicit enumeration is impractical as a synthesis technique.
A branch-and-bound synthesis method was developed, but unfortunately it is also
impractical for large problems. We suspect that the synthesis problem is
NP-complete (computationally intractable), but have not proved this conjecture. In any
case, we believe it is more promising to investigate fast heuristic algorithms for
synthesizing static control structures.

The next general strategy investigated made use of the fact that in many
applications the input values change at discrete times. Under this assumption,
block diagram schemata are closer to traditional models of real-time computations.
Previous research has found optimal schedulers for the special case of one
processor and independent tasks. However, simpler static priority schedulers had
been ignored except for the special case of the latency specifications being
identical to the bandwidth period. We developed an efficient algorithm for assigning
priorities to independent tasks when the latency specification is less than the
bandwidth period. The synthesis techniques were modified to construct control
structures for block diagram schemata in which the blocks were not independent.

Since the analysis of the real-time performance of block diagram schemata under
a static priority control structure is similar to the analysis of static priority
queueing systems, the priority assignment algorithm can also be applied to priority
queueing systems.

Finally, we discussed some of the issues that arise when more than one
processor is available to the control structure. The real-time performance of
multiprocessor control structures was analyzed, and absolute bounds on the
real-time performance for a block diagram schema were derived. If the real-time
specifications can be met by a multiprocessor control structure, the objective
becomes minimizing the number of processors needed to implement a feasible
control structure. Several special cases are known to be NP-complete, so the
general problem is also NP-complete. However, there is reason to believe that
simple algorithms will produce control structures using a number of processors that
differs from the minimal number by a bounded factor, although no specific algorithms
were investigated.

Future work should probably concentrate on either proving various synthesis
problems to be NP-complete or finding efficient algorithms. In the event the
problems are intractable, the performance of efficient heuristic algorithms should be
studied. Certainly any implementation of a practical language system based on
block diagram schemata should attempt to find and improve such heuristic methods.
A practical system should also attempt to make use of more of the special cases for
which efficient algorithms are known.